Cloudera

2008
Palo Alto, CA, USA
Jul 10, 2020   |  By Szilard Nemeth
This blog post covers how customers can migrate clusters and workloads to the new Cloudera Data Platform – Data Center 7.1 (CDP DC 7.1 onwards), plus highlights of this new release. CDP DC 7.1 is the on-premises version of Cloudera Data Platform.
Jul 9, 2020   |  By Liliana Kadar
This article gives you an overview of Cloudera’s Operational Database (OpDB) performance optimization techniques. Cloudera’s Operational Database can support high-speed transactions of up to 185K/second per table and a high of 440K/second per table. On average, the recorded transaction speed is about 100K-300K/second per node. The article also shows how you can optimize your OpDB deployment in either Cloudera Data Platform (CDP) Public Cloud or Data Center.
Jul 8, 2020   |  By Wim Stoop
As organizations look to get smarter and more agile in how they gain value and insight from their data, they are now able to take advantage of a fundamental shift in architecture. In the last decade, as an industry, we have gone from monolithic machines with direct-attached storage to VMs to cloud. The main attraction of cloud is its separation of compute and storage, a major architectural shift in the infrastructure layer that changes the way data can be stored and processed.
Jul 8, 2020   |  By Zoltan Haindrich
In the lifecycle of a data warehouse in production, there are a variety of tasks that need to be executed on a recurring basis. To name a few concrete examples, scheduled tasks can be related to data ingestion (inserting data from a stream into a transactional table every 10 minutes), query performance (refreshing a materialized view used for BI reporting every hour), or warehouse maintenance (executing replication from one cluster to another on a daily basis).
Jul 7, 2020   |  By Marton Balassi
Our 1.2.0.0 release of Cloudera Streaming Analytics Powered by Apache Flink brings a wide range of new functionality, including support for lineage and metadata tracking via Apache Atlas, support for connecting to Apache Kudu and the first iteration of the much-awaited FlinkSQL API. Flink’s SQL interface democratizes stream processing, as it caters to a much larger community than the currently widely used Java and Scala APIs focusing on the Data Engineering crowd.
Jun 28, 2018   |  By Cloudera
Enterprises require fast, cost-efficient solutions to the familiar challenges of engaging customers, reducing risk, and improving operational excellence to stay competitive. The cloud is playing a key role in accelerating time to benefit from new insights. Managed cloud services that automate provisioning, operation, and patching will be critical for enterprises to leverage the full promise of the cloud when it comes to time to value and agility.
Jun 26, 2018   |  By Cloudera
The adoption of cloud computing in the financial services sector has grown substantially in the past three years on a global basis. Diversification of risk is always a key concern for financial institutions and the seeming safety of having a single cloud provider is not being properly measured from a systemic risk and operational risk perspective.
Jun 12, 2018   |  By Cloudera
This white paper provides a reference architecture for running Enterprise Data Hub on Oracle Cloud Infrastructure. Topics include installation automation, automated configuration and tuning, and best practices for deployment and topology to support security and high availability.
May 17, 2018   |  By Cloudera
A cloud-based analytics platform needs to be easy, unified, and enterprise-grade to meet the demands of your business. This white paper covers how Cloudera's machine learning and analytics platform complements popular cloud services like Amazon Web Services (AWS) and Microsoft Azure, and enables customers to organize, process, analyze, and store data at large scale...anywhere.
May 15, 2018   |  By Cloudera
The Modern Platform for Machine Learning and Analytics Optimized for Cloud.
Apr 21, 2020   |  By Cloudera
Today I'm going to show you how to run Data Engineering workloads on Cloudera Data Hub. First, we'll deploy a Data Hub cluster with Zeppelin and Spark. Then, I'll show you an example of a PySpark job accessing data on S3. After that, we'll run another PySpark job to access data in Hive.
Mar 19, 2020   |  By Cloudera
This short video is meant to provide a quick preview of the customer experience for CDP on Microsoft Azure covering topics like CDP procurement through the Azure Marketplace and key user experiences including SDX, Data Hub, Data Warehouse and Machine Learning.
Mar 11, 2020   |  By Cloudera
In this video I'll show you how to get started with Cloudera Data Warehouse in CDP public cloud. I'll walk you through activating an environment for use with the Data Warehouse experience, creating a Virtual Warehouse, and then loading in some data. After loading data in, I'll show you how to connect your Virtual Warehouse to Tableau.
Feb 18, 2020   |  By Cloudera
Cloudera Data Platform (CDP) on Public Cloud makes being an admin for a big data platform even easier thanks to SDX. Watch me spend a day at a temp position for Aperture Cybertronics as their Data Admin. I'll quickly deploy clusters, grant users access, and change performance settings such as autoscaling for the Aperture Cybertronics staff.
Jan 31, 2020   |  By Cloudera
Cloudera Data Warehouse is just one of the many experiences you can use on the Cloudera Data Platform (CDP). Cloudera Data Warehouse packages up projects you may already know and use, such as Impala and Hive, into a service. This service runs on Kubernetes, which gives it the ability to pause, resume, and scale up or down quickly and automatically.