ELT is a process for copying data from a source system into a target system. It stands for “Extract, Load, Transform.” First, a copy of the data is extracted from the source. Next, that copy is loaded as-is into a target system such as a data warehouse. Finally, it is transformed inside the target into a format usable by things like modern cloud applications.
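The three steps can be sketched in a few lines of Python. This is only an illustration, not how any particular tool implements ELT: the in-memory SQLite database stands in for a data warehouse, and the `SOURCE_ROWS` data and the `raw_orders` and `orders` table names are hypothetical.

```python
import sqlite3

# Hypothetical source records; in practice these would come from an API or database.
SOURCE_ROWS = [
    {"id": 1, "email": " Ada@Example.com ", "amount_cents": 1250},
    {"id": 2, "email": "grace@example.com", "amount_cents": 990},
]


def extract():
    """Extract: take a copy of the source data without modifying the source."""
    return [dict(row) for row in SOURCE_ROWS]


def load(conn, rows):
    """Load: write the raw copy into the target unchanged."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(id INTEGER, email TEXT, amount_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:id, :email, :amount_cents)", rows
    )


def transform(conn):
    """Transform: build a cleaned table inside the target, using SQL."""
    conn.execute("DROP TABLE IF EXISTS orders")
    conn.execute(
        "CREATE TABLE orders AS "
        "SELECT id, lower(trim(email)) AS email, amount_cents / 100.0 AS amount "
        "FROM raw_orders"
    )


def run_elt():
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    load(conn, extract())
    transform(conn)
    return conn.execute("SELECT id, email, amount FROM orders ORDER BY id").fetchall()
```

The point of the ordering is that the raw copy lands in the target before any cleanup happens, so the transformation can be rerun or changed later without going back to the source.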
Meltano provides open-source, self-hosted, CLI-first, debuggable, and extensible software for managing ELT pipelines. A Meltano project manages your Singer tap and target configurations, making it easy to select which entities and attributes to extract. These pipelines track their own incremental replication state so they can pick up where the previous run left off. Once your raw data is in the target system, Meltano helps you transform it into a usable format. Pipelines can run on a schedule and be handed off to supported orchestrators like Apache Airflow.
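To make the idea of incremental replication state concrete, here is a minimal sketch in plain Python. It does not use Meltano or Singer and is not their implementation. The `SOURCE` rows, the `updated_at` column, the JSON state file layout, and every function name are assumptions made for illustration.

```python
import json
import os
import tempfile

# Hypothetical source table with an increasing `updated_at` value per row.
SOURCE = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]


def read_state(path):
    """Return the saved bookmark, or None if this is the first run."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)["bookmark"]


def write_state(path, bookmark):
    """Persist the bookmark so the next run can resume from it."""
    with open(path, "w") as f:
        json.dump({"bookmark": bookmark}, f)


def run_pipeline(source, state_path, target):
    """Copy only rows newer than the bookmark into target; return the row count."""
    bookmark = read_state(state_path)
    new_rows = [
        row for row in source
        if bookmark is None or row["updated_at"] > bookmark
    ]
    target.extend(new_rows)
    if new_rows:
        write_state(state_path, max(row["updated_at"] for row in new_rows))
    return len(new_rows)


if __name__ == "__main__":
    state_path = os.path.join(tempfile.mkdtemp(), "state.json")
    target = []
    print(run_pipeline(SOURCE, state_path, target))  # first run copies every row
    print(run_pipeline(SOURCE, state_path, target))  # nothing new since last run
```

Because the bookmark is stored outside the process, a second run only picks up rows that changed after the previous run finished.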
In this episode we talk to Douwe Maan, a Co-Founder of Meltano, about their product-market fit and delivery plans.
Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com to get 15% off the first three months of audio editing and transcription services with code: SED. Thanks to We Edit Podcasts for partnering with SE Daily.
Are you bored writing scripts to move data into SaaS tools like Salesforce? Hightouch is the easiest way to sync data into the tools that your business teams rely on. It’s simple — connect your data warehouse, paste a SQL query, and use our visual mapper to specify how data should appear in downstream tools. No scripts, just SQL. Get started for free at hightouch.io/sedaily.
Rockset is a real-time indexing database, delivered as a cloud service and purpose-built for real-time analytics. It indexes your data to power sub-second analytical queries, including search, aggregations, and joins, on any type of data at cloud scale. Deliver real-time interactive analytics in your application in record time. Try now with a free 14-day trial: softwareengineeringdaily.com/rockset.
Most people use some combination of Zoom and Slack or Teams to meet with people online, but Gather has these little 8-bit spaces where you walk around with your arrow keys, and your video turns on when you get close to someone else, so you actually feel like you’re walking around an office and bumping into coworkers. It’s like a mix of Pokemon and Zoom. Anyway, dev teams have been using Gather for pair programming and team lunches, and to just do a lot of the things you’d usually do in an office. It’s free for your first 25 users if you go to gather.town/sedaily.
Act in Time with InfluxData. Easy to start, easy to scale. InfluxDB is THE open source time series database, purpose-built to handle the massive volumes of time-stamped data produced by IoT devices, applications, networks, containers, and computers. Programmable and performant, InfluxDB gives you high granularity, high scale, and high availability. Capture, analyze, and store millions of points per second to see across all your data sources. For more information and to try it for free, visit influxdata.com/sedaily.
Today’s podcast is brought to you by Google Cloud and the DORA research team. The team recently launched a survey to collect insights for the 2021 State of DevOps report and would love your input! The State of DevOps report is the largest and longest-running research of its kind, providing insight into how we can improve software delivery performance with DevOps. By completing the survey, you get to shape the conversation on DevOps along with the more than 30 thousand software professionals who took the survey over the past six years. So what are you waiting for? Take the survey at cloud.google.com/devops!