Palo Alto, CA, USA
Jun 10, 2021   |  By Gokul Kamaraj
Apache Spark is a very popular analytics engine used for large-scale data processing. It is widely used for many big data applications and use cases. CDP Operational Database Experience (COD) is a CDP Public Cloud service that lets you create and manage operational database instances, and it is powered by Apache HBase and Apache Phoenix.
Jun 8, 2021   |  By David LeGrand
If you have read our previous post focusing on the challenges of planning, launching and scaling IIOT use cases, you’ve narrowed down the business problems you’re trying to solve, and you have a plan that is both created by the implementation team and supported by executive management. Here’s a plan to make sure you’ve got it all down. Think of these success factors as the legs of a kitchen table, and of the results you desire as a bowl of homemade chicken soup.
Jun 7, 2021   |  By Gyula Fora
At the end of March, we released the first version of Cloudera SQL StreamBuilder as part of CSA 1.3. It enabled users to easily write, run and manage real-time SQL queries on streams from Apache Kafka with an exceptionally smooth user experience. Since then, we have been working hard to expose the full power of Apache Flink SQL and the existing Data Warehousing tools in CDP to combine it into a state-of-the-art real-time analytics platform.
Jun 7, 2021   |  By Dinesh Chandrasekhar
Cloudera has been named a Strong Performer in the Forrester Wave for Streaming Analytics, Q2 2021. We are excited to be recognized in this wave in what we consider to be such a strong position. We are proud to have been named as one of “The 14 providers that matter most” in streaming analytics. The report states that richness of analytics, development tool options and near-effortless scalability are what streaming analytics customers should look for in a provider.
Jun 7, 2021   |  By Kenny Gorman
In October of 2020 Cloudera acquired Eventador, and Cloudera Streaming Analytics (CSA) 1.3.0 was released early in 2021. It was the first release to incorporate SQL Stream Builder (SSB) from the acquisition, and brought rich SQL processing to the already robust Apache Flink offering. With that completed, the team’s focus turned to bringing Flink Data Definition Language (DDL) and the batch interface into SSB.
Jun 3, 2021   |  By Alexander Lavoie
Cloudera Support’s cluster validations proactively identify known problem signatures contained in customers’ diagnostic data with the goal of increasing cluster health, performance, and overall stability. Cluster validations are included in a customer’s enterprise subscription at no additional cost. All customers with access to the Support case portal will also be able to take advantage of cluster validations.
Jun 2, 2021   |  By Xiaoyu Yao
Apache Ozone is a distributed object store built on top of the Hadoop Distributed Data Store service. It can manage billions of small and large files that are difficult for other distributed file systems to handle. As an important part of achieving better scalability, Ozone separates the metadata management among different services: Ozone Manager (OM) service manages the metadata of the namespace such as volume, bucket and keys.
Jun 2, 2021   |  By Joydeep Das
Data pipelines are in high demand in today’s data-driven organizations. As critical elements in supplying trusted, curated, and usable data for end-to-end analytic and machine learning workflows, data pipelines are becoming indispensable. To keep up, data pipelines are being vigorously reshaped with modern tools and techniques.
Jun 1, 2021   |  By Mick Hollison
Today marks the beginning of an exciting new chapter for Cloudera. Cloudera will become a private company with the flexibility and resources to accelerate product innovation, cloud transformation and customer growth.
May 27, 2021   |  By Cloudera Contributors
This blog post was written by Pedro Pereira as a guest author for Cloudera. Right now, someone somewhere is writing the next fake news story or editing a deepfake video. An authoritarian regime is manipulating an artificial intelligence (AI) system to spy on technology users. No matter how good the intentions behind the development of a technology, someone is bound to corrupt and manipulate it. Big data and AI amplify the problem. “If you have good intentions, you can make it very good.
Jun 8, 2021   |  By Cloudera
Join us live to discuss the latest features in CDP Public Cloud!
Jun 8, 2021   |  By Cloudera
In this meetup, we’re going to put ourselves in the shoes of an electric car manufacturer that produces all the parts for its cars in house. First, we’ll show you an example of how this fictional car company could walk through the process of creating a prediction model based on part production data. We will then automate the creation of these models by making them depend on an upstream data collection process. To finish it off, we’ll deploy these models and make them accessible via an external API, all within a native cloud environment using the Cloudera Data Platform.
Jun 2, 2021   |  By Cloudera
Join us live with Fast Forward Labs to discuss the recently possible in Machine Learning and AI. Being able to recommend an item of interest to a user (based on their past preferences) is a highly relevant problem in practice. A key trend over the past few years has been session-based recommendation algorithms that provide recommendations solely based on a user’s interactions in an ongoing session, and which do not require the existence of user profiles or their entire historical preferences. This report explores a simple, yet powerful, NLP-based approach (word2vec) to recommend a next item to a user. While NLP-based approaches are generally employed for linguistic tasks, here we exploit them to learn the structure induced by a user’s behavior or an item’s nature.
Jun 2, 2021   |  By Cloudera
Full data lifecycle projects hold tremendous potential for organizations to uncover new insights and drivers of revenue and profitability. Big Data has brought the promise of doing device data capture, data enrichment, data science, and analytics at scale to enterprises. This promise also comes with challenges for developers, admins, and consumers to continuously access new data and collaborate.
May 20, 2021   |  By Cloudera
How do you get your data from A to B? We take you on a journey with your data through: Join us to find out more about managing your data lifecycle, and see it in action during our demo. AGENDA 18:00 - Welcome 18:05 - Best Practice: Streaming Data & Analytics 18:20 - Demo: Collect, Curate, Predict & Visualise your Streaming Data 19:00 - Open Networking 19:30 - END
May 18, 2021   |  By Cloudera
Join us LIVE to discuss what's new in CDP Public Cloud! We will discuss new features with Data Flow and Operational Database. Don't miss the live Q&A and a chance to win swag!
May 13, 2021   |  By Cloudera
Kafka Summit Europe 2021 takes place May 11-12. What will be the major takeaways and interesting points from the event? Join us, live, to discuss what we think are the most important things to know. Take advantage of live Q&A with some of Cloudera's event streaming experts.
May 6, 2021   |  By Cloudera
Continuous SQL uses Structured Query Language (SQL) to create computations against unbounded streams of data and to store the results in persistent storage. The results held in persistent storage can be connected to other applications to provide an analytical visualization of your data. Compared to traditional SQL, in Continuous SQL the data has a start, but no end. This means that queries continuously process results to a sink or other target types. When you define your job in SQL, the SQL statement is interpreted and validated against a schema. After the statement is executed, the results that match the criteria are continuously returned.
Apr 29, 2021   |  By Cloudera
In this meetup, we’re going to once again put ourselves in the shoes of an electric car manufacturer that is deploying a recently developed electric motor in its new cars. We’re going to show how to explore data that has previously been collected from various sources and stored in Apache Hive within a data warehouse, with the goal of tracking down a specific set of potentially defective parts. We’ll then take the results of this data exploration and create an interactive dashboard that presents our results in a visually appealing way using a BI tool that’s integrated right into the same data warehouse.
Apr 29, 2021   |  By Cloudera
Join us for this month's Machine Learning research discussion with Cloudera Fast Forward Labs. We will discuss few-shot text classification, including a live demo and Q&A. This is an applied research report by Cloudera Fast Forward. We write reports about emerging technologies. Accompanying each report are working prototypes or code that exhibit the capabilities of the algorithm and offer detailed technical advice on its practical application.
Jun 28, 2018   |  By Cloudera
Enterprises require fast, cost-efficient solutions to the familiar challenges of engaging customers, reducing risk, and improving operational excellence to stay competitive. The cloud is playing a key role in accelerating time to benefit from new insights. Managed cloud services that automate provisioning, operation, and patching will be critical for enterprises to leverage the full promise of the cloud when it comes to time to value and agility.
Jun 26, 2018   |  By Cloudera
The adoption of cloud computing in the financial services sector has grown substantially in the past three years on a global basis. Diversification of risk is always a key concern for financial institutions and the seeming safety of having a single cloud provider is not being properly measured from a systemic risk and operational risk perspective.
Jun 12, 2018   |  By Cloudera
This white paper provides a reference architecture for running Enterprise Data Hub on Oracle Cloud Infrastructure. Topics include installation automation, automated configuration and tuning, and best practices for deployment and topology to support security and high availability.
May 17, 2018   |  By Cloudera
A cloud-based analytics platform needs to be easy, unified, and enterprise-grade to meet the demands of your business. This white paper covers how Cloudera's machine learning and analytics platform complements popular cloud services like Amazon Web Services (AWS) and Microsoft Azure, and enables customers to organize, process, analyze, and store data at large scale...anywhere.
May 15, 2018   |  By Cloudera
The Modern Platform for Machine Learning and Analytics Optimized for Cloud.
Mar 25, 2018   |  By Cloudera
In the wake of the global financial crisis, the world has become much more interconnected and immensely more complex. As a result, you can no longer simply look at the past as an indicator of future trends. The financial services industry needs real-time insights into numerous interacting variables to make informed decisions.

Cloudera delivers the modern platform for machine learning and analytics optimized for the cloud. Imagine having access to all your data in one platform. The opportunities are endless. We enable you to transform vast amounts of complex data into clear and actionable insights to enhance your business and exceed your expectations.

The right products for the job:

  • Enterprise Data Hub: Operate with confidence—thanks to comprehensive security and governance—while at the same time enabling unrivaled self-service performance at extreme scale. All in an enterprise-grade solution that lets you run anywhere, on-premises or in hybrid- and multi-cloud environments.
  • Data Science Workbench: Accelerate machine learning from research to production with the secure, self-service data science platform built for the enterprise.
  • Data Warehouse: A modern data warehouse that delivers an enterprise-grade, hybrid cloud solution designed for self-service analytics.
  • Data Science & Engineering: Cloudera Data Science provides better access to Apache Hadoop data with familiar and performant tools that address all aspects of modern predictive analytics.
  • Altus Cloud: The industry’s first machine learning and analytics cloud platform built with a shared data experience.

The world’s leading organizations choose Cloudera to grow their businesses, improve lives, and advance human achievement.