[Talk Slides] Data Engineering in the Cloud

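I was delighted to share my cloud experience with the Taiwanese community at DataCon.TW on September 30 this year. Attached are the slides I used that day, “Data Engineering in the Cloud”.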

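From Cloudera's perspective, the vast majority of enterprises using Hadoop still deploy it in their own datacenters. However, a growing number of customers (20% according to the latest figures) are, for various reasons, moving some or even all of their workloads to the cloud.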

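There are many ways to deploy on the cloud. These slides cover a series of recent developments at Cloudera, including deployment recommendations for private cloud (OpenStack), an overview of popular cloud environments, and some technical details of cloud storage.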

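I had planned to close the talk with a demo of Cloudera Altus. Cloudera Altus is a PaaS offering, similar to Amazon EMR, that provides Big Data frameworks. For many large enterprises (put simply: customers willing to pay for Hadoop services), security is the top concern, and EMR's security controls do not meet the standards of many of them. Cloudera Altus, by contrast, brings the on-prem security experience that Cloudera already does well to the cloud environment.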

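Unfortunately, I got carried away in the earlier part of the talk and ran out of time for the Altus demo, and on top of that the network was not cooperating... I hope to get a chance to introduce this cool offering the next time I am back in Taiwan. After all, the company paid for my flight, and I ended up spending two weeks having fun in Taiwan without delivering the most important part, the company plug Q.Q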

 

http://2017.datacon.tw/archives/131/

Talk title: Making Elephants Fly: Hadoop on Cloud

Abstract:

“The compute infrastructure landscape is rapidly changing. At Cloudera, we are seeing more customers running Hadoop on the cloud in recent years, and we want to make sure the entire Hadoop ecosystem makes full use of the cloud, supporting more cloud infrastructures and more use cases.

In this talk, I am going to share the recent development of Hadoop/CDH on Amazon S3 and on Microsoft Azure Data Lake Storage (ADLS), including performance improvement and performance/cost comparison. Hadoop users can use the information to assess which storage offering to use that makes the most sense to them. I will also do a demo of Cloudera’s new managed Hadoop offering, Cloudera Altus, for Data Engineering use case on the cloud.”

Speaker bio:

Wei-Chiu is a software engineer at Cloudera and one of the committers in the Apache Hadoop project. He is mostly interested in the development of large-scale distributed systems.
