Solvedairflow Migrate from 2.1.4 to 2.2.0

Apache Airflow version

2.2.0

Operating System

Linux

Versions of Apache Airflow Providers

default.

Deployment

Docker-Compose

Deployment details

Using airflow-2.2.0python3.7

What happened

Upgrading image from apache/airflow:2.1.4-python3.7
to apache/airflow:2.2.0-python3.7
Cause this inside scheduler, which is not starting:

Python version: 3.7.12
Airflow version: 2.2.0
Node: 6dd55b0a5dd7
-------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedColumn: column dag.max_active_tasks does not exist
LINE 1: ..., dag.schedule_interval AS dag_schedule_interval, dag.max_ac...
                                                             ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/auth.py", line 51, in decorated
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/views.py", line 588, in index
    filter_dag_ids = current_app.appbuilder.sm.get_accessible_dag_ids(g.user)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/security.py", line 377, in get_accessible_dag_ids
    return {dag.dag_id for dag in accessible_dags}
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column dag.max_active_tasks does not exist
LINE 1: ..., dag.schedule_interval AS dag_schedule_interval, dag.max_ac...
                                                             ^

[SQL: SELECT dag.dag_id AS dag_dag_id, dag.root_dag_id AS dag_root_dag_id, dag.is_paused AS dag_is_paused, dag.is_subdag AS dag_is_subdag, dag.is_active AS dag_is_active, dag.last_parsed_time AS dag_last_parsed_time, dag.last_pickled AS dag_last_pickled, dag.last_expired AS dag_last_expired, dag.scheduler_lock AS dag_scheduler_lock, dag.pickle_id AS dag_pickle_id, dag.fileloc AS dag_fileloc, dag.owners AS dag_owners, dag.description AS dag_description, dag.default_view AS dag_default_view, dag.schedule_interval AS dag_schedule_interval, dag.max_active_tasks AS dag_max_active_tasks, dag.max_active_runs AS dag_max_active_runs, dag.has_task_concurrency_limits AS dag_has_task_concurrency_limits, dag.next_dagrun AS dag_next_dagrun, dag.next_dagrun_data_interval_start AS dag_next_dagrun_data_interval_start, dag.next_dagrun_data_interval_end AS dag_next_dagrun_data_interval_end, dag.next_dagrun_create_after AS dag_next_dagrun_create_after 
FROM dag]
(Background on this error at: http://sqlalche.me/e/13/f405)

What you expected to happen

Automatic database migration and properly working scheduler.

How to reproduce

Ugrade from 2.1.4 to 2.2.0 with some dags history.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

33 Answers

✔️Accepted Answer

SQL we used from our deployment playbook to clean up these tables

Obviously a disclaimer on running this blindly. Validate its rows you can safely remove first by just executing the CTEs

BEGIN;

-- Remove dag runs without a valid run_id
DELETE FROM dag_run WHERE run_id is NULL;

-- Remove task fails without a run_id
WITH task_fails_to_remove AS (
  SELECT 
    task_fail.dag_id,
    task_fail.task_id,
    task_fail.execution_date
  FROM
    task_fail
  LEFT JOIN 
    dag_run ON 
    dag_run.dag_id = task_fail.dag_id 
    AND dag_run.execution_date = task_fail.execution_date
  WHERE
    dag_run.run_id IS NULL
)
DELETE FROM
    task_fail
USING 
    task_fails_to_remove
WHERE (
    task_fail.dag_id = task_fails_to_remove.dag_id
    AND task_fail.task_id = task_fails_to_remove.task_id
    AND task_fail.execution_date = task_fails_to_remove.execution_date
);

-- Remove task instances without a run_id
WITH task_instances_to_remove AS (
  SELECT
    task_instance.dag_id,
    task_instance.task_id,
    task_instance.execution_date
  FROM
    task_instance
  LEFT JOIN 
    dag_run 
    ON dag_run.dag_id = task_instance.dag_id
    AND dag_run.execution_date = task_instance.execution_date
  WHERE 
    dag_run.run_id is NULL
)
DELETE FROM 
    task_instance
USING
    task_instances_to_remove
WHERE (
    task_instance.dag_id = task_instances_to_remove.dag_id
    AND task_instance.task_id = task_instances_to_remove.task_id
    AND task_instance.execution_date = task_instances_to_remove.execution_date
);

COMMIT;

Other Answers:

Seems other people have similar issue #18912 so for me it looks highly unlikely this was manually added.

I think the number of reports we have in such short time (I saw 3 reports already) indicate that those rows can appear frequently as result of normal operations by Airflow, and many users might hve similar issues soon.

If this is realy result of "regular" airflow behaviour, for me it calls for a very quick 2.2.1 with improved migration to handle that case (cc: @kaxil @jedcunningham)

Problem is that there is no clue, which rows need to be deleted. Also there is no straight logic behind "corresponding rows" for me.

This part of code for me is the only source of some guidelines.

If there is better way to deal with this kind of problem in future, I will appreciate to hear it :)

If some references are lost, this kind of error should be repaired automatically IHMO.

What's the reason for those rows to appear in the DB ? How did they get there? Do we know it?

I think there are two possible scenarios:

  1. If it is result of normal operation of Airflow ( even some some special circumstances, manually triggered dags using CLI or UI)

If that's the case, I agree it should be handled better - Airflow cleanig them or adding "fake" run_ids for those during migration is what I would expect as we should handle this as "regular" migration scenario.

  1. They are result of user adding manually entries to the database.

If this is the case then the BEST we could do is to spit-out the exact SQL query to run to delete those rows. We should not delete them automaticaly.

Case 1. Yes, I'm triggering dags manually from UI or by RestApi.
Case 2. That was very first time I had to work with Airflow Database. I have never add or modified it manually.
I'm only reading connection from db.

Case 2. That was very first time I had to work with Airflow Database. I have never add or modified it manually.
I'm only reading connection from db.

Thanks for confirming @sbialkowski-pixel! @kaxil @jedcunningham I think we need to seriously consider 2.2.1

Related Issues:

27
airflow Negsignal.SIGKILL error on macOS
It's not completely related but I had a different error message and Google lead me here ...
15
airflow Migrate from 2.1.4 to 2.2.0
SQL we used from our deployment playbook to clean up these tables Obviously a disclaimer on running ...
13
airflow Airflow webserver not starting with SQLAlchemy==1.3.16
The problem seems to be SQLAlchemy version 1.3.16 which was released 8 hours ago i downgraded to 1.3...
11
airflow scheduler gets stuck without a trace
We just saw this on 2.0.1 when we added a largish number of new DAGs (We're adding around 6000 DAGs ...
11
airflow Add Production-ready docker compose for the production image
I have prepared some Dockerfiles with some common configuration Description In order to use the prod...
32
argo workflows Argo *namespace-install* raises failed to save outputs: Failed to establish pod watch: unknown (get pods)
This is a great answer man Checklist: [ X ] I've included the version [ X ] I've included reproducti...
4
argo workflows Default Archive Location Error
Configuration:\nartifactRepository: {}\nfeatureFlags: {}\nmetricsConfig:\n disableLegacy: false\npod...
117
superset Was unable to import superset Error: cannot import name '_maybe_box_datetimelike'
tx @Uneasy-listening !!! This worked for me: [only necessary if you have already installed pandas (p...
20
superset Error logging in to superset (sqlalchemy.exc.ProgrammingError)
I share the following in hopes that my struggle will help someone else After battling with this issu...
16
goaccess Support structured log formats such as JSON
Native JSON support has been added Feel free to build from development to test this out It will be p...
14
superset How to use “email report”?
@sandy-ha please run superset init again in terminal because FAB(FlaskAppbuilder) will initialize re...
11
superset Filters as a native dashboard (v2) construct
Due to our changing priorities we have been unable to complete the dashboard level filters feature ...
11
superset How to have multiple filters on dashboard affecting different charts?
Oh you're right There's currently no way to express that We could add a feature that would introduce...
8
goaccess Web access returns 400 Invalid Request
@allinurl I just experienced this issue as well I believe it occurs when the total size of the reque...
6
goaccess GoAccess debian repository install does not work on Stretch (libssl1.0.2)
same for Debian 10!! Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux Hi Due to unmet dependen...
5
goaccess goaccess not running in --real-time-html
Glad that worked :) Usually it's best to use the public IP with --ws-url rather than the hostname ju...
5
devilbox How to add subdomains?
Hi @mrpsiho @mavimedia I just double checked it locally and it seems to work fine ...
4
superset PyArrow broke the build on ARM platform [Regression]
I have the exact same issue on Windows 10 64-bit pip install apache-superset==0.35.1 on Raspbian (Ra...
3
goaccess How can I pass custom nginx log format to GoAccess
@xxxatt For the sample line you posted above this works for me Please make sure you are using the la...
3
superset docker-compose failed to build
Setup/Environment: Windows 10 + Docker Desktop + ~ 6 GB of RAM given to Docker (can be configured in...
3831
axios Axios catch error returns javascript error not server response
I have exactly the same environment Try this: Modify from console.log(error) to console.log(error.re...
731
scrapy ' error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 '
@euler16 for scrapy with Python 3 you'll need with Python 2 you'll need I wanted to install scrapy i...
684
laradock Mysql. The server requested authentication method unknown to the client [caching_sha2_password]
alter user 'username'@'localhost' identified with mysql_native_password by 'password'; would fix it....
517
react navigation screenIsActive prop / componentDidFocus event for TabNavigator items
It probably makes sense to add lifecycle hooks to screens In one of my Tabs i need to load Data from...
474
meteor [1.4.2.1] Error: ENFILE: file table overflow
I was getting the same after an upgrade to macOS Sierra Turns out macOS have a harsh limit on number...
423
ipython Last jedi release (0.18.0) is incompatible with ipython (7.19 and 7.18 tested); reason - column arg was deprecated, and now removed
As a temporary fix for anyone just trying to get things working again: It would be really nice if yo...
397
material ui Module not found: Can't resolve 'material-ui-icons/Menu' Martial Next
For anyone else experiencing this issue: npm install @material-ui/icons https://www.npmjs.com/packag...
395
laravel dompdf (1/1) ErrorException Non-static method Barryvdh\DomPDF\PDF::loadView() should not be called statically
This happens because you are namespacing the wrong PDF class You are namespacing Barryvdh\DomPDF\PDF...
378
webpacker localIdentName option moved in css-loader configuration
I faced same issue after upading css-loader but I solved it If you check css-loader readme ...
364
react navigation Send data back from child screen?
@itswaze You can do something along these lines to pass back from the child screen ...
358
react navigation How to goBack from nested StackNavigator?
@dhruvparmar372 According to the NOTE in the doc a navigator's navigation prop may not have the help...
352
axios POST request works in Browser but not on Node
This might be considered a duplicate of #789 I was able to use the form-data package with Axios in n...
317
react navigation Best pattern for a 'Save' button in the header
Try setting your component instance's handleSave function as a navigation state parameter after the ...
310
DefinitelyTyped [@types/react] RefObject.current should no longer be readonly
It's not It'a intentionally left readonly to ensure correct usage even if it's not frozen ...
306
react native navigation [V3][Android] FATAL EXCEPTION: create_react_context
OK after a good night of sleep I've found why I was having this issue In the MainApplication.java I ...
303
ts node Custom typings not working with ts-node 8.0.2
When using with ts-node you have to add --files flag After updating to ts-node version 8.0.2 the cus...
294
ohmyzsh compinit:503: no such file or directory: /usr/local/share/zsh/site-functions/_brew
Per #9602 (comment) brew cleanup fixed it for me I am using Apple M1 When I added this line: export ...
292
laradock SQLSTATE[HY000] [2054] The server requested authentication method unknown to the client
+1 I'm having the same problem here. Info: Docker version ($ docker --version): Docker version 17.12...
289
vagrant vagrant box update - Fails with 404 Not Found error
A workarround to add in your Vagrantfile: Vagrant version Host operating system Ubuntu 16.04.3 LTS G...
263
vagrant vagrant --help displays a rubygems error
To fix this error: Vagrant version Host operating system Expected behavior vagrant and vagrant --hel...
262
date fns Can't resolve 'date-fns/_lib/format/longFormatters'
You probably forgot to install date-fns or Code: import DateFnsUtils from '@date-io/date-fns'; ...
259
virtualenv Error creating virtualenv with python3.6
The original poster's problem is due to not having the 'python3.6-venv' package installed ...
252
provider A Product was used after being disposed. flutter: Once you have called dispose() on a Product, it can no longer be used.
Oh I see what you're doing Don't: DO: i have a ChangeNotifireProvider that such that i do pushReplac...
248
react navigation Reset to nested route, "There is no route defined for..."
A quick workaround (not heavily tested): set the key property to null on your action. ...
238
axios Adding headers to axios.post method
Edit: I had to add Authorization to allowed headers in my CORS filter @jffernandez I'm having the sa...
231
axios Adding Retry Parameter
@mericsson I am too in need of exponential backoff when retrying I've put together the following whi...
224
homebrew openjdk Cask adoptopenjdk8 exists in multiple taps
I think AdoptOpenJDK8 should be removed from the homebrew-cask-versions repo Somone(TM) should proba...
222
nativescript cli TNS doctor doesn't recognize Xcode (High Sierra)
HI @philipfeldmann Can yhou please run xcodebuild -version and paste the output? May be Xcode is ins...
215
axios BaseURL not being used
Please IGNORE THIS ISSUE Found the problem: I was setting baseUrl but it should be baseURL. ...