Google Cloud Dataflow is a cloud-based data processing service for both batch and real-time data streaming applications. It enables developers to set up processing pipelines for integrating, preparing and analyzing large data sets, such as those found in Web analytics or big data analytics applications.
The Cloud Dataflow software expands on earlier Google parallel processing projects, including MapReduce, which originated at the company. Cloud Dataflow is designed to bring to entire analytics pipelines the style of fast parallel execution that MapReduce brought to a single type of computation in batch processing jobs.


PayPal Holdings, Inc. is an American multinational financial technology company operating an online payments system in the majority of countries that support online money transfers. It serves as an electronic alternative to traditional paper methods such as checks and money orders.

Spotify is a Swedish audio streaming and media services provider founded on 23 April 2006 by Daniel Ek and Martin Lorentzon. It is the world's largest music streaming service provider.

Dotz Inc. is a technology company with extensive consumer knowledge whose business model brings together technology, data, loyalty, marketplace, and techfin. It pioneered the loyalty market in Brazil through the Dotz program.

We help companies generate value and build products with big data and AI. We bring innovation from pioneers of big data and AI to other aspiring companies.
To enable them to solve data-based use cases, we build data and ML platforms with them.

The world's largest beauty company, inventing the future of beauty while transforming into the #1 BeautyTech company of the future.

BlaBlaCar is the world’s leading community-based travel network, enabling over 100 million members to share a ride across 22 markets.

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.


Cloud-based marketing platform for e-tailers


Fetch Rewards is a discount app for grocery shopping. The platform offers a cashback and gift card earning app that rewards purchases. Users can snap pictures of their receipts or submit e-receipts using the receipt scanner to earn gift cards and reward points. Rewards can be redeemed at popular stores, including Target, Amazon, Walmart, and more. The company offers mobile applications for Android and iOS.

Fox Corporation produces and distributes compelling news, sports and entertainment content through its iconic domestic brands including: FOX News, FOX Sports, the FOX Network, and the FOX Television Stations.

Collibra is a data intelligence solution provider. It offers a cloud-based platform that connects IT and the business to build a data-based culture for the digital enterprise. It helps IT organizations reduce complexity, risk, and costs. Collibra was a spin-off from STARLab at the Free University of Brussels. The Data Governance Center covers all key data governance and stewardship activities, including master data management, data quality, and metadata management.

Heap’s mission is to power business decisions with truth. It empowers companies to focus on what matters—discovering insights and taking action—not building pipelines or tagging.

Knotch is a SaaS platform that helps brands connect their content to desired business outcomes. Its data-driven approach to content research, holistic data collection, analysis, and optimization is empowering the world’s most notable brands to fully understand their content.

Databand.ai built the industry's first proactive data observability platform, which isolates data errors as early as data integration and triages issues to alert relevant stakeholders before there's a crisis.

Improving the Python data ecosystem with managed SaaS solutions that help with deployment, production, and collaboration on data projects using Python.

Skypoint’s mission is to bring people and data together.

We are the industry's first modern data stack platform with built-in data lakehouse, customer 360, data privacy vault, privacy compliance automation, data governance, analytics, and managed services for organizations in several industries, including healthcare, life sciences, senior living, retail, hospitality, business services, and financial services.

Industry leaders and over 10 million end users currently use Skypoint.

CircleUp is on a mission to empower entrepreneurs with the funding and support that they need to thrive. Having identified over a hundred successful brands, we use our Helio business intelligence platform to increase the speed, quality, and objectivity of decision-making in the private company landscape through a unique application of data and machine learning technology.

Kyligence was founded in 2016 by the original creators of Apache Kylin™, the leading open source OLAP for big data. Kyligence offers an intelligent OLAP platform to simplify multi-dimensional analytics for the cloud data lake. Its AI-augmented engine detects patterns from the most frequently asked business queries, builds governed data marts automatically, and brings metrics accountability to the data lake to optimize the data pipeline and avoid excessive numbers of tables. It provides a unified SQL interface between cloud object stores, cubes, indexes, and underlying data sources, with a cost-based smart query router for business intelligence, ad-hoc analytics, and data services at petabyte scale.

PingCAP is the company behind TiDB, an open-source, distributed, NewSQL database that supports hybrid transactional and analytical processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.

InfoPrice is a technology and data company focused on pricing and dynamic pricing for retail.

Pluralsight provides an online platform for programming courses. The products include assessments, analytics, role customization, professional services, and more. 

Twilio powers the future of business communications, enabling phones, VoIP, and messaging to be embedded into web, desktop, and mobile software.



The company for unified messaging and streaming with Apache Pulsar.

Quanto helps your company make more accurate decisions based on open finance data.


chongtechnologies, a "data driven company by design", is an engineering, data science, and artificial intelligence consultancy. We started in 2019 as a small group of people united by a broad vision of transforming the data industry, empowering companies to turn data into information for decision-making, improve business performance, and gain strategic advantages over the competition.

In addition, we position ourselves as an integrator of solutions and services capable of supporting companies' entire data-driven journey through consulting services and strategic partnerships with the main products on the market. In this way, we are able to support our customers from infrastructure, through culture, to value delivery through advanced analytics.

We streamline the way companies manage, analyze, and exchange high volumes of information and events in real time, providing data from multiple sources to countless destinations at the right time.

Large companies are already using real-time data streaming to boost their business. What is the path you plan to take? The one about digital transformation, right? Come with us.

Dashlane's mission is to make security simple for millions of organizations and their people. We empower businesses of every size to protect company and employee data while helping everyone easily log in to the accounts they need—anytime, anywhere. Over 17 million users and 20,000 businesses in 180 countries use Dashlane for a faster, simpler, and more secure internet.

The Stackable Data Platform is designed with openness and flexibility in mind. It provides you with a curated selection of the best open source data apps like Apache Kafka®, Apache Druid, Trino and Apache Spark™. Stackable is your modern, open source data tool distribution enabling you to build your ideal data stack.

Fanatics is a leading global digital sports platform, complete with offerings that excite fans and maximize the reach and presence for partners across the entire sports ecosystem. We operate more than 300 online and offline stores, including e-commerce business with all major professional sports leagues (NFL, MLB, NBA, NHL, NASCAR, MLS, PGA), major media brands (NBC Sports, CBS Sports, FOX Sports), and more than 300 collegiate and professional team properties.

🧙 A modern replacement for Airflow. Open-source data pipeline tool for transforming and integrating data.

SelectDB is a cloud-native real-time data warehouse based on the Apache Doris open source project and developed by the same key developers. The SelectDB company focuses on the enterprise-grade cloud-native distribution of Apache Doris.

Apache Doris is a real-time analytical database based on MPP architecture, known for its high performance and ease of use. It supports both high-concurrency point queries and high-throughput complex analysis. (https://github.com/apache/doris)

The most powerful no-code chatbot builder.

Adevinta is an online marketplace for second-hand goods. The company offers pre-owned mobiles, furniture, vehicles, job, real estate, cars, consumer goods, and more.

Overview	Google Cloud Dataflow	Apache Spark
Categories	Data Streaming	Data Modelling and Transformation
Stage	Late Stage	Late Stage
Target Segment	Enterprise, Mid Size	Mid Size, Enterprise
Deployment	SaaS	On Prem
Business Model	Commercial	Open Source
Pricing	Freemium	Freemium
Location	US	US

Compare - Google Cloud Dataflow VS Apache Spark
