Track Awesome Data Engineering Updates Weekly
A curated list of data engineering tools for software developers
😺 igorbarinov/awesome-data-engineering · ⭐ 4.6K · 🏷️ Big Data
Feb 08 - Feb 14, 2021
- Data Council: The first technical conference that bridges the gap between data scientists, data engineers and data analysts.
Feb 04 - Feb 10, 2019
- Twitter Realtime: The Streaming APIs give developers low-latency access to Twitter’s global stream of Tweet data.
- GitHub Archive: GitHub's public timeline since 2011, updated every hour.
Aug 21 - Aug 27, 2017
- Data Engineering Podcast: The show about modern data infrastructure.
Apr 10 - Apr 16, 2017
- /r/dataengineering: News, tips and background on data engineering.
- /r/etl: Subreddit focused on ETL.
Mar 20 - Mar 26, 2017
- Reddit: Real-time data, including comments, submissions and links posted to Reddit.
- Common Crawl: Open source repository of web crawl data.
- Wikipedia: Wikipedia's complete copy of all wikis, in the form of wikitext source and metadata embedded in XML. A number of raw database tables in SQL form are also available.
Sep 07 - Sep 13, 2015
- Eventsim (⭐422): Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.
Jul 20 - Jul 26, 2015
- Prometheus.io (⭐45k): An open-source service monitoring system and time series database.
- HAProxy Exporter (⭐578): Simple server that scrapes HAProxy stats and exports them via HTTP for Prometheus consumption.