{"id":1611,"date":"2023-03-06T12:37:50","date_gmt":"2023-03-06T11:37:50","guid":{"rendered":"https:\/\/www.loicmathieu.fr\/wordpress\/?p=1611"},"modified":"2023-03-06T17:54:30","modified_gmt":"2023-03-06T16:54:30","slug":"introduction-a-kestra","status":"publish","type":"post","link":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/introduction-a-kestra\/","title":{"rendered":"Introduction to Kestra"},"content":{"rendered":"<p>Kestra is an open-source data orchestrator and scheduler. With Kestra, data workflows, called flows, use the YAML format and are executed by its engine via an API call, the user interface, or a trigger (webhook, schedule, SQL query, Pub\/Sub message, &#8230;).<\/p>\n<p>The key concepts in Kestra are:<\/p>\n<ul><li>The <strong>flow<\/strong>: describes how the data will be orchestrated (in other words, the workflow). It is a sequence of <strong>tasks<\/strong>.<\/li> \n\n<li>The <strong>task<\/strong>: a step in the <strong>flow<\/strong> that performs an action on one or more pieces of incoming data, the <strong>inputs<\/strong>, and then generates zero or one piece of output data, the <strong>output<\/strong>.<\/li> \n\n<li>The <strong>execution<\/strong>: an instance of a <strong>flow<\/strong>, created when the flow is triggered.<\/li> \n\n<li>The <strong>trigger<\/strong>: defines how a <strong>flow<\/strong> can be triggered.<\/li> \n<\/ul>\n<p>We will see these concepts in practice in the rest of this article.<\/p>\n<p>To launch Kestra locally, the easiest way is to use the provided <a href=\"https:\/\/github.com\/kestra-io\/kestra\/blob\/develop\/docker-compose.yml\" rel=\"noopener nofollow\" target=\"_blank\">Docker Compose file<\/a>, then run it via <code>docker compose up<\/code>.<\/p>\n<p>This Docker Compose file will run Kestra in <strong>standalone<\/strong> mode (all components in a single process) with a PostgreSQL database as a queue and repository (for storing and executing flows), and local storage (for storing flow 
data).<\/p>\n<p>Kestra is composed of several components that communicate via queues and store flow information via repositories. These queues and repositories can be implemented in different ways (memory, JDBC, Kafka, Elasticsearch). I&#8217;m not going into detail about Kestra&#8217;s architecture today, but if you are curious, you can refer to Kestra&#8217;s <a href=\"https:\/\/kestra.io\/docs\/architecture\/\" target=\"_blank\" rel=\"noopener\">architecture documentation<\/a>.<\/p>\n<p>Once Kestra is started, a graphical interface will be available on port 8080: <a href=\"http:\/\/localhost:8080\" rel=\"noopener nofollow\" target=\"_blank\">http:\/\/localhost:8080<\/a>. We will use this interface for all the examples in this article.<\/p>\n<p>At the first launch, a Guided Tour will be offered. You can follow it, or skip it to go straight to the examples in this article.<\/p>\n<h2>My first flow<\/h2>\n<p>To create a flow, go to the <strong>Flows<\/strong> menu and then click on the <strong>Create<\/strong> button at the bottom right. 
Now you have a textarea in which you will be able to enter the YAML description of the flow.<\/p>\n<pre lang=\"yaml\">\nid: hello-world\nnamespace: fr.loicmathieu.example\n\ntasks:\n  - id: hello\n    type: io.kestra.core.tasks.debugs.Echo\n    format: \"Hello World\"\n<\/pre>\n<p>A flow has:<\/p>\n<ul><li>An <code>id<\/code> property that is its unique identifier within a namespace.<\/li>\n\n<li>A <code>namespace<\/code> property, namespaces are hierarchical like directories in a file system.<\/li>\n\n<li>A <code>tasks<\/code> property which is the list of tasks to be performed at the execution of the flow.<\/li>\n<\/ul>\n<p>Here we have added a single task that has three properties:<\/p>\n<ul><li>An <code>id<\/code> property that is its unique identifier within the flow.<\/li>\n\n<li>A <code>type<\/code> property which is the name of the class that represents the task.<\/li>\n\n<li>A <code>format<\/code> property, this is a property specific to Echo tasks that defines the format of the message to be logged (an Echo task is like the echo shell command).<\/li>\n<\/ul>\n<p>Each task will have its own properties, which are documented in the online documentation as well as in the documentation integrated in the Kestra graphical interface (Documentation -&gt; Plugins -&gt; then select a plugin in the right menu).<\/p>\n<p>Within the editor, the YAML description of the flow is validated and autocompletion is available for the type and properties of the tasks via CTRL+SPACE.<\/p>\n<p>Kestra is based on a plugin system that allows you to add tasks (but also triggers or conditions that we will see later). 
By default, Kestra ships with very few plugins, but the Docker image used in the Docker Compose file is built with all the plugins maintained by the Kestra development team. Their documentation is here: <a href=\"https:\/\/kestra.io\/plugins\/\" rel=\"noopener\" target=\"_blank\">https:\/\/kestra.io\/plugins\/<\/a>.<\/p>\n<p>To launch this first flow, go to the <strong>Executions<\/strong> tab and then click the <strong>New Execution<\/strong> button. This switches to the Gantt view of the flow execution, which is updated in real time with the status of the execution.<\/p>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01-1024x267.png?resize=640%2C167&#038;ssl=1\" alt=\"\" width=\"640\" height=\"167\" class=\"alignnone size-large wp-image-1628\" srcset=\"https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?resize=1024%2C267&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?resize=300%2C78&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?resize=768%2C200&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?resize=1536%2C400&amp;ssl=1 1536w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?resize=604%2C157&amp;ssl=1 604w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?w=1631&amp;ssl=1 1631w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-01.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\n<p>The <strong>Logs<\/strong> tab allows you to see the runtime logs.<\/p>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02-1024x270.png?resize=640%2C169&#038;ssl=1\" alt=\"\" width=\"640\" height=\"169\" class=\"alignnone size-large wp-image-1629\" srcset=\"https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?resize=1024%2C270&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?resize=300%2C79&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?resize=768%2C203&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?resize=1536%2C405&amp;ssl=1 1536w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?resize=604%2C159&amp;ssl=1 604w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?w=1591&amp;ssl=1 1591w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-02.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\n<p>Note the <code>Hello World<\/code> log generated by the <strong>hello<\/strong> task.<\/p>\n<p>Now let&#8217;s add an input named <strong>name<\/strong> to our flow to give it the name of the person to say hello to. To do this, we need to edit the flow by selecting the flow, either via the breadcrumb navigation at the top ( Home \/ Flows \/ fr.loicmathieu.example.hello-world), or by left-clicking on <strong>Flows<\/strong> and selecting it from the list; then go to the <strong>Source<\/strong> tab.<\/p>\n<p>We can pass data to a flow via inputs. 
We will define a <strong>name<\/strong> input of type <strong>STRING<\/strong>.<\/p>\n<pre lang=\"yaml\">\nid: hello-world\nnamespace: fr.loicmathieu.example\n\ninputs:\n  - type: STRING\n    name: name\n\ntasks:\n  - id: hello\n    type: io.kestra.core.tasks.debugs.Echo\n    format: \"Hello {{inputs.name}}\"\n<\/pre>\n<p>When we click on the New Execution button, a form asks us to fill in the flow inputs.<\/p>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03-1024x262.png?resize=640%2C164&#038;ssl=1\" alt=\"\" width=\"640\" height=\"164\" class=\"alignnone size-large wp-image-1630\" srcset=\"https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?resize=1024%2C262&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?resize=300%2C77&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?resize=768%2C197&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?resize=1536%2C394&amp;ssl=1 1536w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?resize=604%2C155&amp;ssl=1 604w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?w=1600&amp;ssl=1 1600w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-03.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\n<p>After entering a value for the name and running the flow, we can see that the log contains the value of the input.<\/p>\n<p><code>Hello {{inputs.name}}<\/code> is an expression that uses the Pebble templating engine that will render what is between the <code>{{<\/code> and <code>}}<\/code> moustache. 
Here the Pebble expression will fetch the value of the <strong>inputs.name<\/strong> variable that points to the input with name <strong>name<\/strong>.<\/p>\n<h2>Inputs and BigQuery<\/h2>\n<p>In this example, the flow will take a CSV file as input, load it into Google BigQuery, then run a query on the loaded data and display the result in the logs.<\/p>\n<p>I&#8217;m using BigQuery here because it&#8217;s easier than starting a database, but it requires a Google Cloud account and a service account set up. To do this, you need to create a Google Cloud <a href=\"https:\/\/cloud.google.com\/iam\/docs\/service-account-overview\" rel=\"noopener\" target=\"_blank\">service account<\/a> and use it in the <code>serviceAccount<\/code> property of the BigQuery task. To avoid hard-coding this variable, it is possible to use an environment variable or a cluster global variable, see the online documentation on <a href=\"https:\/\/kestra.io\/docs\/developer-guide\/variables\/\" rel=\"noopener\" target=\"_blank\">variables<\/a>.<\/p>\n<p>Without further ado, the solution ;)<\/p>\n<pre lang=\"yaml\">\nid: beer\nnamespace: fr.loicmathieu.example\ndescription: A flow to handle my beers data\n\ninputs:\n  - type: FILE\n    description: Beers data\n    name: beers\n\ntasks:\n  - id: start-log\n    type: io.kestra.core.tasks.debugs.Echo\n    format: \"{{taskrun.startDate}} - {{task.id}} - Launching the flow\"\n  - id: load-beers\n    type: io.kestra.plugin.gcp.bigquery.Load\n    serviceAccount: \"\"\n    csvOptions:\n      fieldDelimiter: \",\"\n    destinationTable: beers.beers\n    format: CSV\n    from: \"{{inputs.beers}}\"\n  - id: query-beers\n    type: io.kestra.plugin.gcp.bigquery.Query\n    serviceAccount: \"\"\n    fetchOne: true\n    sql: |\n      SELECT count(*) as count FROM beers.beers\n  - id: end-log\n    type: io.kestra.core.tasks.debugs.Echo\n    format: \"{{taskrun.startDate}} - {{outputs['query-beers'].row.count}} beers loaded\"\n<\/pre>\n<p>Here I have defined an 
input of type <code>FILE<\/code> named <strong>beers<\/strong>. Kestra will detect the type of the input and generate the corresponding flow execution form, which lets you upload a CSV file containing a list of beers. This file will be stored in Kestra&#8217;s internal storage and available in all tasks.<\/p>\n<p>The <strong>load-beers<\/strong> task of type <code>io.kestra.plugin.gcp.bigquery.Load<\/code> is used to load a CSV file into a BigQuery table. The BigQuery dataset <strong>beers<\/strong> must have been created before this task is executed. The <strong>from<\/strong> property takes a file from the internal storage (actually the URI of the file, which is retrieved from the internal storage at the last moment). Here I use the Pebble expression <code>{{inputs.beers}}<\/code> to retrieve the file passed as input to the flow.<\/p>\n<p>The <strong>query-beers<\/strong> task of type <code>io.kestra.plugin.gcp.bigquery.Query<\/code> is used to run a BigQuery query. There are several ways to store the query result. Here, I use the <code>fetchOne: true<\/code> property, which configures the task to fetch a single row and put the result of the query in the task output. It is also possible to load all rows (<code>fetch: true<\/code>), or to store the results in Kestra&#8217;s internal storage (<code>store: true<\/code>), which is recommended for queries that return many rows.<\/p>\n<p>The <strong>end-log<\/strong> task writes a log, as we have already seen. Here, we want to log the number of records loaded into the database, so we get the corresponding output from the <strong>query-beers<\/strong> task via a Pebble expression: <code>{{outputs['query-beers'].row.count}}<\/code>.<\/p>\n<p>The expression <code>{{outputs['query-beers'].row.count}}<\/code> may seem intriguing:<\/p>\n<ul><li><code>outputs['query-beers']<\/code>: means the output of the <strong>query-beers<\/strong> task. So far we have seen the dotted notation (.) 
to access variables, but because this task id contains a &#8216;-&#8217; character, we are forced to use the subscript notation ([]), as &#8216;-&#8217; is a special character for Pebble.<\/li>\n\n<li><code>row<\/code>: is the name of the attribute set as output by the <strong>query-beers<\/strong> task. A task can have multiple output attributes; refer to the task documentation for the list.<\/li>\n\n<li><code>count<\/code>: is the name of the column.<\/li>\n<\/ul>\n<h2>ForEach and file format<\/h2>\n<p>In this example, we will query the 10 most viewed Wikipedia pages using the BigQuery public dataset <strong>wikipedia.pageviews_2023<\/strong> for the French, English and German languages. Then we will transform the result into CSV.<\/p>\n<pre lang=\"yaml\">\nid: wikipedia-top-ten\nnamespace: fr.loicmathieu.example\ndescription: A flow that loads wikipedia top 10 FR pages each hour\n\ntasks:\n  - id: start-log\n    type: io.kestra.core.tasks.debugs.Echo\n    format: \"{{taskrun.startDate}} - {{task.id}} - Launching the flow\"\n  - id: for-each-countries\n    type: io.kestra.core.tasks.flows.EachSequential\n    tasks:\n      - id: query-top-ten\n        type: io.kestra.plugin.gcp.bigquery.Query\n        serviceAccount: \"\"\n        sql: |\n          SELECT DATETIME(datehour) as date, title, views \n          FROM `bigquery-public-data.wikipedia.pageviews_2023` \n          WHERE DATE(datehour) = current_date() and wiki = '{{taskrun.value}}' and title not in ('Cookie_(informatique)', 'Wikip\u00e9dia:Accueil_principal', 'Sp\u00e9cial:Recherche')\n          ORDER BY datehour desc, views desc\n          LIMIT 10\n        store: true\n      - id: write-csv\n        type: io.kestra.plugin.serdes.csv.CsvWriter\n        from: \"{{outputs['query-top-ten'][taskrun.value].uri}}\"\n    value: '[\"fr\", \"en\", \"de\"]'\n<\/pre>\n<p>The <strong>for-each-countries<\/strong> task of type <code>io.kestra.core.tasks.flows.EachSequential<\/code> allows for looping. 
It runs its list of child tasks once for each value passed in its <strong>value<\/strong> property; here, the languages fr, en and de.<\/p>\n<p>The <strong>query-top-ten<\/strong> task of type <code>io.kestra.plugin.gcp.bigquery.Query<\/code> will execute a query on BigQuery whose result will be stored in Kestra&#8217;s internal storage (<code>store: true<\/code>). It uses the Pebble expression <code>{{taskrun.value}}<\/code>, which retrieves the current value from the <code>EachSequential<\/code> loop.<\/p>\n<p>The <strong>write-csv<\/strong> task of type <code>io.kestra.plugin.serdes.csv.CsvWriter<\/code> will rewrite the file stored by the previous task in the CSV format. By default, Kestra uses the Amazon Ion format for stored objects, so this task converts the file from Ion to CSV. It uses the Pebble expression <code>{{outputs['query-top-ten'][taskrun.value].uri}}<\/code>, in which <code>[taskrun.value].uri<\/code> retrieves the value of the <strong>uri<\/strong> attribute for the current loop iteration.<\/p>\n<p>After executing the flow, you can go to the <strong>Outputs<\/strong> tab to access the task outputs and, among other things, download the CSV files generated by the flow.<\/p>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04-1024x749.png?resize=640%2C468&#038;ssl=1\" alt=\"\" width=\"640\" height=\"468\" class=\"alignnone size-large wp-image-1631\" srcset=\"https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?resize=1024%2C749&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?resize=300%2C219&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?resize=768%2C562&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?resize=1536%2C1124&amp;ssl=1 1536w, 
https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?resize=369%2C270&amp;ssl=1 369w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?w=1590&amp;ssl=1 1590w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-04.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\n<h2>Trigger<\/h2>\n<p>By default, a flow can only be run manually via the graphical interface or the Kestra API.<\/p>\n<p>It is also possible to trigger a flow from an external event; this is the role of the trigger.<\/p>\n<p>Kestra includes three basic triggers: <strong>flow<\/strong>, which allows you to trigger a flow from another flow, <strong>webhook<\/strong>, which allows you to trigger a flow from a webhook URL, and <strong>schedule<\/strong>, which allows you to trigger a flow periodically from a cron expression. Many other triggers are available within Kestra&#8217;s plugins and allow you to trigger a flow from, for example, a message in a broker, a file, or the presence of a record in a database table.<\/p>\n<p>The following example will trigger a flow every minute. It uses a cron expression to define how often the flow is triggered.<\/p>\n<pre lang=\"yaml\">\nid: flow-with-trigger\nnamespace: fr.loicmathieu.example\n\ntriggers:\n  - id: schedule\n    type: io.kestra.core.models.triggers.types.Schedule\n    cron: \"*\/1 * * * *\"\n\ntasks:\n  - id: \"echo\"\n    type: \"io.kestra.core.tasks.debugs.Echo\"\n    format: \"{{task.id}} &gt; {{taskrun.startDate}}\"\n<\/pre>\n<h2>Data processing with Python and Pandas<\/h2>\n<p>Kestra offers advanced functionality via tasks that allow you to execute Bash, Python or Node.js scripts.<\/p>\n<p>The following example will query the same BigQuery dataset of Wikipedia page views, write the result in CSV format, and then use that CSV in an <code>io.kestra.core.tasks.scripts.Python<\/code> task that allows you to run a Python script.<\/p>\n<p>This task 
takes the following properties:<\/p>\n<ul><li><code>inputFiles<\/code>: a list of files that must contain the <code>main.py<\/code> file that will be run by the task. A second file, <code>data.csv<\/code>, is defined to give local access to the file created by the <strong>write-csv<\/strong> task. Kestra will automatically retrieve it from its internal storage and make it available in the working directory of the Python task.<\/li>\n\n<li><code>requirements<\/code>: a list of pip dependencies. Here we add the Pandas library, which we use to analyze the data in the CSV file.<\/li>\n<\/ul>\n<pre>\nid: wikipedia-top-ten-pyhton-panda\nnamespace: fr.loicmathieu.example\ndescription: A flow that loads wikipedia top 10 FR pages each hour\n\ntasks:\n  - id: query-top-ten\n    type: io.kestra.plugin.gcp.bigquery.Query\n    serviceAccount: \"\"\n    sql: |\n      SELECT DATETIME(datehour) as date, title, views \n      FROM `bigquery-public-data.wikipedia.pageviews_2023` \n      WHERE DATE(datehour) = current_date() and wiki = 'fr' and title not in ('Cookie_(informatique)', 'Wikip\u00e9dia:Accueil_principal', 'Sp\u00e9cial:Recherche')\n      ORDER BY datehour desc, views desc\n      LIMIT 10\n    store: true\n  - id: write-csv\n    type: io.kestra.plugin.serdes.csv.CsvWriter\n    from: \"{{outputs['query-top-ten'].uri}}\"\n  - id: \"python\"\n    type: io.kestra.core.tasks.scripts.Python\n    inputFiles:\n      data.csv: \"{{outputs['write-csv'].uri}}\"\n      main.py: |\n        import pandas as pd\n        from kestra import Kestra\n        data = pd.read_csv(\"data.csv\")\n        data.info()\n        sumOfViews = data['views'].sum()\n        Kestra.outputs({'sumOfViews': int(sumOfViews)})\n    requirements:\n      - pandas\n<\/pre>\n<p>The Python script will use Pandas to read the CSV file and transform it into a Pandas data frame, then compute the sum of the <strong>views<\/strong> column. 
This sum will then be put into the task output using the Kestra Python library.<\/p>\n<p>Here are the execution logs of this flow.<\/p>\n<img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05-1024x414.png?resize=640%2C259&#038;ssl=1\" alt=\"\" width=\"640\" height=\"259\" class=\"alignnone size-large wp-image-1632\" srcset=\"https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?resize=1024%2C414&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?resize=300%2C121&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?resize=768%2C311&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?resize=1536%2C621&amp;ssl=1 1536w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?resize=604%2C244&amp;ssl=1 604w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?w=1610&amp;ssl=1 1610w, https:\/\/i0.wp.com\/www.loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-05.png?w=1280&amp;ssl=1 1280w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/>\n<h2>Conclusion<\/h2>\n<p>In this introductory article, we have seen the main concepts of Kestra and some flow examples.<\/p>\n<p>To go further, you can check out the <a href=\"https:\/\/kestra.io\/docs\/\" rel=\"noopener\" target=\"_blank\">online documentation<\/a> as well as the <a href=\"https:\/\/kestra.io\/plugins\/\" rel=\"noopener\" target=\"_blank\">plugin list<\/a>.<\/p>\n<p>Kestra is an open-source community project available on GitHub. Feel free to:<\/p>\n<ul><li>Put a star or open an issue on its <a href=\"https:\/\/github.com\/kestra-io\/kestra\" rel=\"noopener\" target=\"_blank\">repository<\/a>.<\/li>\n\n<li>Follow Kestra on <a href=\"https:\/\/twitter.com\/kestra_io\" 
rel=\"noopener\" target=\"_blank\">Twitter<\/a> or <a href=\"https:\/\/www.linkedin.com\/company\/kestra\/\" rel=\"noopener\" target=\"_blank\">LinkedIn<\/a>.<\/li>\n\n<li>Contact the team on <a href=\"https:\/\/api.kestra.io\/v1\/communities\/slack\/redirect\" rel=\"noopener\" target=\"_blank\">Slack<\/a>.<\/li>\n<\/ul>\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>Kestra is an open-source data orchestrator and scheduler. With Kestra, data workflows, called flows, use the YAML format and are executed by its engine via an API call, the user interface, or a trigger (webhook, schedule, SQL query, Pub\/Sub message, &#8230;). The important notions of Kestra are: The flow: which describes how the data will be orchestrated (the workflow thus). It is a sequence of tasks. The task: a step in the flow that will perform an action&#8230;<p class=\"read-more\"><a class=\"btn btn-default\" href=\"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/introduction-a-kestra\/\"> Read More<span class=\"screen-reader-text\">  Read 
More<\/span><\/a><\/p><\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"activitypub_content_warning":"","activitypub_content_visibility":"","activitypub_max_image_attachments":4,"activitypub_interaction_policy_quote":"anyone","activitypub_status":"","footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[9],"tags":[211,212],"class_list":["post-1611","post","type-post","status-publish","format-standard","hentry","category-informatique","tag-kestra","tag-orchestrator"],"aioseo_notices":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":1786,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/concevoir-un-saas-multitenant\/","url_meta":{"origin":1611,"position":0},"title":"Designing a multi-tenant SaaS","author":"admin","date":"Tuesday March  5th, 2024","format":false,"excerpt":"This article is based on my talk Designing a multi-tenant SaaS given at Cloud Nord on October 12, 2023. Kestra is a highly scalable data scheduling and orchestration platform that creates, executes, schedules and monitors millions of complex pipelines. 
For an introduction to Kestra, you can read my article on\u2026","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-software-architecture-1024x576.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-software-architecture-1024x576.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/kestra-software-architecture-1024x576.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":1606,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/au-revoir-zenika-bonjour-kestra\/","url_meta":{"origin":1611,"position":1},"title":"(Fran\u00e7ais) Au revoir Zenika, bonjour Kestra","author":"admin","date":"Tuesday January 10th, 2023","format":false,"excerpt":"Sorry, this entry is only available in Fran\u00e7ais.","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1650,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/le-profiler-sql-de-visualvm\/","url_meta":{"origin":1611,"position":2},"title":"VisualVM SQL profiler SQL","author":"admin","date":"Tuesday April  4th, 2023","format":false,"excerpt":"A little while ago, I discovered the SQL profiler of VisualVM and I thought I should share it with you ;). VisualVM is a tool that provides a visual interface to display detailed information about applications running on a Java Virtual Machine (JVM). 
VisualVM is designed for use in development\u2026","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/Capture-decran-du-2023-04-03-14-17-25-1024x624.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/Capture-decran-du-2023-04-03-14-17-25-1024x624.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/Capture-decran-du-2023-04-03-14-17-25-1024x624.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":2007,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/characterescapes-jackson-s-hidden-gem\/","url_meta":{"origin":1611,"position":3},"title":"CharacterEscapes: Jackson&#8217;s hidden gem","author":"admin","date":"Wednesday September 10th, 2025","format":false,"excerpt":"At Kestra, the data orchestration platform I work for, we had an issue ([#10326] (https:\/\/github.com\/kestra-io\/kestra\/issues\/10326)) opened by a user reporting a problem with the PostgreSQL database and the Unicode character \\u0000. A workflow task that returned this character in its output was failing. 
After investigation, PostgreSQL refuses to store a\u2026","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1847,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/jooq-tip-ne-convertissez-pas-jsonb-en-string\/","url_meta":{"origin":1611,"position":4},"title":"jOOQ tip: don&#8217;t convert JSONB to a String","author":"admin","date":"Wednesday October 23rd, 2024","format":false,"excerpt":"A few weeks ago, while investigating possible performance improvements for Kestra's JDBC backend, I noticed that a method we were using to map an entity to be persisted in the database into its JSONB representation was taking up a lot of time in our CPU profiles. In the following flame\u2026","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/366507373-c44206a5-0085-43cc-902e-97756319b0ea-1024x737.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/366507373-c44206a5-0085-43cc-902e-97756319b0ea-1024x737.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/loicmathieu.fr\/wordpress\/wp-content\/uploads\/366507373-c44206a5-0085-43cc-902e-97756319b0ea-1024x737.png?resize=525%2C300&ssl=1 1.5x"},"classes":[]},{"id":1731,"url":"https:\/\/www.loicmathieu.fr\/wordpress\/informatique\/optimisation-dindex-postgresql\/","url_meta":{"origin":1611,"position":5},"title":"PostgreSQL index optimization","author":"admin","date":"Tuesday August 22nd, 2023","format":false,"excerpt":"Some time ago, I worked on query execution time optimizations for PostgreSQL, I talk about it here: The VISUALVM SQL PROFILE. 
Kestra is a highly scalable data orchestration and scheduling platform that creates, executes, schedules, and monitors millions of complex pipelines. It's also the company I work for! The open\u2026","rel":"","context":"In &quot;informatique&quot;","block_context":{"text":"informatique","link":"https:\/\/www.loicmathieu.fr\/wordpress\/category\/informatique\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/posts\/1611","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/comments?post=1611"}],"version-history":[{"count":25,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/posts\/1611\/revisions"}],"predecessor-version":[{"id":1648,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/posts\/1611\/revisions\/1648"}],"wp:attachment":[{"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/media?parent=1611"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/categories?post=1611"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.loicmathieu.fr\/wordpress\/wp-json\/wp\/v2\/tags?post=1611"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}