欢迎来到课桌文档! | 帮助中心 课桌文档-建筑工程资料库
课桌文档
全部分类
  • 党建之窗>
  • 感悟体会>
  • 百家争鸣>
  • 教育整顿>
  • 文笔提升>
  • 热门分类>
  • 计划总结>
  • 致辞演讲>
  • 在线阅读>
  • ImageVerifierCode 换一换
    首页 课桌文档 > 资源分类 > DOC文档下载  

    CarbonData在BI场景中的测试.doc

    • 资源ID:21141       资源大小:58KB        全文页数:6页
    • 资源格式: DOC        下载积分:10金币
    快捷下载 游客一键下载
    会员登录下载
    三方登录下载: 微信开放平台登录 QQ登录  
    下载资源需要10金币
    邮箱/手机:
    温馨提示:
    用户名和密码都是您填写的邮箱或者手机号,方便查询和重复下载(系统自动生成)
    支付方式: 支付宝    微信支付   
    验证码:   换一换

    加入VIP免费专享
     
    账号:
    密码:
    验证码:   换一换
      忘记密码?
        
    友情提示
    2、PDF文件下载后,可能会被浏览器默认打开,此种情况可以点击浏览器菜单,保存网页到桌面,就可以正常下载了。
    3、本站不支持迅雷下载,请使用电脑自带的IE浏览器,或者360浏览器、谷歌浏览器下载即可。
    4、本站资源下载后的文档和图纸-无水印,预览文档经过压缩,下载后原文更清晰。
    5、试题试卷类文档,如果标题没有明确说明有答案则都视为没有答案,请知晓。

    CarbonData在BI场景中的测试.doc

    -CarbonData在BI场景中的测试测试目的CarbonData格式与Parquet格式在BI场景下的性能比照测试版本集群配置cpu:Intel(R) Core(TM) i7-4820K CPU  3.70GHz 1U *4Coremem: 64G节点个数:5测试素材使用*BI工程中的事实表作为测试表,建表DDL语句如下:create table isdzd_b0_10y_c(userid_   string,    btype_    string,     shtag_    string,     shtag2_   string,     option_   string,    f2        double,         f3        double,         f4        double,         f5        double,         f6        double,         f7        double,         f8        double,         f9        double,         f10       double,         f11       double,         f12       double,         f13       double,         f14       double,         f15       double,         f16       double,         f17       double,         f18       double,         f19       double,         f20       double,         f21       double,         f22       double,         f23       double,         f24       double,         f25       double,         f26       double,         f27       double,         f28       double,         f29       double,         f30       double,         f31       double,         f32       double,         f33       double,         f34       double,         f35       double,         f36       double,         f37       double,         f38       double,         f39       double,         f40       double,         f41       double,         f42       double,         f43       double,         f44       double,         f45       double,         f46       double,         f47       double,         f48       double,         f49       double,         f50       double,         f51       double,         f52       double,         f53       double,         f54       double,         f55       double,         f56       double,         f57       double,         f58       double,         f59       double,         f60       double,         f61       double,         f62       double,         f63       double,         f64       double,         f65       double,         f66       double,         f67       double,         f68       double,         f69       double,         f70       double,         f71       double,         f72       double,         f73       double,         f74       double,         f75       double,         f76       double,         f77       double,         f78       double,         f79       double,         f80       double,         f81       double,         f82       double,         f83       double,         f84       double,         f85       double,         f86       double,         f87       double,         f88       double,         f89       double,         f90       double,         f91       double,         f92       double,         f93       double,         f94       double,         f95       double,         f96       double,         f97       double,         f98       double,         f99       double,         f100      double,         f101      double,         f102      double,         f103      double,         f104      double,         f105      double,         f106      double,         f107      double,         f108      double,         f109      double,         f110      double,         f111      double,         f112      double,         f113      double,         f114      double,         nsrsbh    string,    nsrmc     string,   hydm      string,    djzcl*    string,    *zqh      string,     zzsjnfs   string,    zzsyhzc   string,    upid_     string,bbq_   string     ) STORED BY 'carbondata'测试方法分别测试在2亿和10亿的数据量下,CarbonData与parquet格式在ad-hoc场景下的性能优劣。测试用例场景用例sql用例即席分析_1个维、1个指标select sum(a.F7) as F70, SUBSTR(a.*ZQH, 1, 2) as *ZQH0from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL)group by SUBSTR(a.*ZQH, 1, 2) order by true limit 600;即席分析_ 2个维、5个指标select sum(a.F7) as F70,sum(a.F8) as F80,sum(a.F9) as F90,sum(a.F12) as F120,sum(a.F13) as F130, SUBSTR(a.*ZQH,1,2) as *ZQH0, SUBSTR(a.DJZCL*,1,1) as DJZCL*0from ISDZD_B0_10Y_c awhere (a.HYDM IS NOT NULL)group by SUBSTR(a.*ZQH,1,2), SUBSTR(a.DJZCL*,1,1) order by true limit 600;一般报表_1个维,5个指标select count(a.NSRSBH) as C2,sum(a.F15) as D2,sum(a.F25) as E2,sum(a.F30) as F2,sum(a.F33) as G2,sum(a.F10) as H2, SUBSTR(a.*ZQH, 1, 2) as B2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2)order by sum(a.F10) desc nulls last limit 1000;一般报表_1个维,20个指标select C2 as C2,D2 as D2,E2 as E2,F2 as F2,G2 as G2,H2 as H2,I2 as I2,J2 as J2,K2 as K2,L2 as L2,M2 as M2,N2 as N2,O2 as O2,P2 as P2,Q2 as Q2,R2 as R2,S2 as S2,T2 as T2,U2 as U2,V2 as V2,B2 as B2,B2_sortn as B2_sortnfrom (select sum(a.F15) as C2,sum(a.F25) as D2,sum(a.F30) as E2,sum(a.F33) as F2,sum(a.F10) as G2,sum(a.F37) as H2,sum(a.F42) as I2,sum(a.F48) as J2,sum(a.F49) as K2,sum(a.F50) as L2,sum(a.F52) as M2,sum(a.F53) as N2,sum(a.F54) as O2,sum(a.F55) as P2,sum(a.F56) as Q2,sum(a.F57) as R2,sum(a.F59) as S2,sum(a.F60) as T2,sum(a.F60) as U2,sum(a.F58) as V2, SUBSTR(a.*ZQH, 1, 2) as B2,row_number() over ( order by sum(a.F10) desc nulls last) as B2_sortnfrom isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) awhere (B2_sortn<=40)order by B2_sortn nulls first limit 1000;多级浮动_ 2个维,5个指标select C3 as C3,D3 as D3,E3 as E3,F3 as F3,G3 as G3,C2 as C2,D2 as D2,E2 as E2,F2 as F2,G2 as G2,B3 as B3,B2 as B2,B2_sortn as B2_sortnfrom (select a.C3 as C3,a.D3 as D3,a.E3 as E3,a.F3 as F3,a.G3 as G3,b.C2 as C2,b.D2 as D2,b.E2 as E2,b.F2 as F2,b.G2 as G2,a.B3 as B3,a.B2 as B2,row_number() over (partition by a.B3 order by b.G2 desc nulls last) as B2_sortnfrom (select sum(a.F15) as C3,sum(a.F25) as D3,sum(a.F30) as E3,sum(a.F33) as F3,sum(a.F10) as G3, SUBSTR(a.DJZCL*, 1, 1) as B3, SUBSTR(a.*ZQH, 1, 2) as B2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2), SUBSTR(a.DJZCL*, 1, 1) aleft join (select sum(a.F15) as C2,sum(a.F25) as D2,sum(a.F30) as E2,sum(a.F33) as F2,sum(a.F10) as G2, SUBSTR(a.*ZQH, 1, 2) as B2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) bon (a.B2=b.B2) awhere (B2_sortn<=40)order by B3 nulls first,B2_sortn nulls first limit 1000;多级浮动_ 2个维,20个指标select C3 as C3,D3 as D3,E3 as E3,F3 as F3,G3 as G3,H3 as H3,I3 as I3,J3 as J3,K3 as K3,L3 as L3,M3 as M3,N3 as N3,O3 as O3,P3 as P3,Q3 as Q3,R3 as R3,S3 as S3,T3 as T3,U3 as U3,V3 as V3,C2 as C2,D2 as D2,E2 as E2,F2 as F2,G2 as G2,H2 as H2,I2 as I2,J2 as J2,K2 as K2,L2 as L2,M2 as M2,N2 as N2,O2 as O2,P2 as P2,Q2 as Q2,R2 as R2,S2 as S2,T2 as T2,U2 as U2,V2 as V2,B3 as B3,B2 as B2,B2_sortn as B2_sortnfrom (select a.C3 as C3,a.D3 as D3,a.E3 as E3,a.F3 as F3,a.G3 as G3,a.H3 as H3,a.I3 as I3,a.J3 as J3,a.K3 as K3,a.L3 as L3,a.M3 as M3,a.N3 as N3,a.O3 as O3,a.P3 as P3,a.Q3 as Q3,a.R3 as R3,a.S3 as S3,a.T3 as T3,a.U3 as U3,a.V3 as V3,b.C2 as C2,b.D2 as D2,b.E2 as E2,b.F2 as F2,b.G2 as G2,b.H2 as H2,b.I2 as I2,b.J2 as J2,b.K2 as K2,b.L2 as L2,b.M2 as M2,b.N2 as N2,b.O2 as O2,b.P2 as P2,b.Q2 as Q2,b.R2 as R2,b.S2 as S2,b.T2 as T2,b.U2 as U2,b.V2 as V2,a.B3 as B3,a.B2 as B2,row_number() over (partition by a.B3 order by b.G2 desc nulls last) as B2_sortnfrom (select sum(a.F15) as C3,sum(a.F25) as D3,sum(a.F30) as E3,sum(a.F33) as F3,sum(a.F10) as G3,sum(a.F37) as H3,sum(a.F42) as I3,sum(a.F48) as J3,sum(a.F49) as K3,sum(a.F50) as L3,sum(a.F52) as M3,sum(a.F53) as N3,sum(a.F54) as O3,sum(a.F55) as P3,sum(a.F56) as Q3,sum(a.F57) as R3,sum(a.F59) as S3,sum(a.F60) as T3,sum(a.F60) as U3,sum(a.F58) as V3, SUBSTR(a.DJZCL*, 1, 1) as B3, SUBSTR(a.*ZQH, 1, 2) as B2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2), SUBSTR(a.DJZCL*, 1, 1) aleft join (select sum(a.F15) as C2,sum(a.F25) as D2,sum(a.F30) as E2,sum(a.F33) as F2,sum(a.F10) as G2,sum(a.F37) as H2,sum(a.F42) as I2,sum(a.F48) as J2,sum(a.F49) as K2,sum(a.F50) as L2,sum(a.F52) as M2,sum(a.F53) as N2,sum(a.F54) as O2,sum(a.F55) as P2,sum(a.F56) as Q2,sum(a.F57) as R2,sum(a.F59) as S2,sum(a.F60) as T2,sum(a.F60) as U2,sum(a.F58) as V2, SUBSTR(a.*ZQH, 1, 2) as B2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) bon (a.B2=b.B2) awhere (B2_sortn<=40)order by B3 nulls first,B2_sortn nulls first limit 1000;图表展现_单图单表select C2 as C2,D2 as D2,B2 as B2,B2_sortn as B2_sortnfrom (select sum(a.F15) as C2,sum(a.F33) as D2, SUBSTR(a.*ZQH, 1, 2) as B2,row_number() over ( order by sum(a.F33) desc nulls last) as B2_sortnfrom isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) awhere (B2_sortn<=10)order by B2_sortn nulls first limit 1000;图表展现_双图双表select * from (select C2 as C2,D2 as D2,B2 as B2,B2_sortn as B2_sortnfrom (select sum(a.F15) as C2,sum(a.F33) as D2, SUBSTR(a.HYDM, 1, 1) as B2,row_number() over ( order by sum(a.F33) desc nulls last) as B2_sortnfrom isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.HYDM, 1, 1) awhere (B2_sortn<=10)order by B2_sortn nulls first limit 1000 ) aunion allselect * from (select C2 as C2,D2 as D2,B2 as B2,B2_sortn as B2_sortnfrom (select sum(a.F15) as C2,sum(a.F33) as D2, SUBSTR(a.*ZQH, 1, 2) as B2,row_number() over ( order by sum(a.F33) as B2_sortnfrom isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) awhere (B2_sortn<=10)order by B2_sortn nulls first limit 1000) b;统计报表下钻 _折叠展开select * from (select sum(a.F15) as B2,sum(a.F25) as C2,sum(a.F30) as D2,sum(a.F33) as E2,sum(a.F10) as F2,sum(a.F37) as G2,sum(a.F42) as H2,sum(a.F48) as I2,sum(a.F49) as J2,sum(a.F50) as K2,sum(a.F52) as L2,sum(a.F53) as M2,sum(a.F54) as N2,sum(a.F55) as O2,sum(a.F56) as P2,sum(a.F57) as Q2,sum(a.F59) as R2,sum(a.F60) as S2,sum(a.F60) as T2,sum(a.F58) as U2,'000000' as A2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1) order by true limit 1000) aunion allselect * from (select sum(a.F15) as B2,sum(a.F25) as C2,sum(a.F30) as D2,sum(a.F33) as E2,sum(a.F10) as F2,sum(a.F37) as G2,sum(a.F42) as H2,sum(a.F48) as I2,sum(a.F49) as J2,sum(a.F50) as K2,sum(a.F52) as L2,sum(a.F53) as M2,sum(a.F54) as N2,sum(a.F55) as O2,sum(a.F56) as P2,sum(a.F57) as Q2,sum(a.F59) as R2,sum(a.F60) as S2,sum(a.F60) as T2,sum(a.F58) as U2, SUBSTR(a.*ZQH, 1, 2) as A2from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1) AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) order by true limit 1000 ) b;统计报表下钻 _跳转select sum(a.F15) as C2,sum(a.F25) as D2,sum(a.F30) as E2,sum(a.F33) as F2,sum(a.F10) as G2,sum(a.F37) as H2,sum(a.F42) as I2,a.DJZCL* as B2,ma*(b.caption_) as mc_B20from isdzd_b0_10y_c aleft join DIM_2021NSSZLD_NSRDJZCL* bon (a.DJZCL*=b.id_)where (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by a.DJZCL*order by sum(a.F33) desc nulls last limit 1000;图表联动select sum(a.F30) as B3,sum(a.F80) as C3,sum(a.F102) as D3, SUBSTR(a.*ZQH, 1, 2) as A3from isdzd_b0_10y_c awhere (a.HYDM IS NOT NULL) AND (a.BBQ_ like '2021%') AND (2>1)group by SUBSTR(a.*ZQH, 1, 2) order by true limit 1000;测试结果场景用例2亿10亿CarbonDataParquetCarbonDataParquet即席分析_1个维、1个指标19.3s17.3s27.8s25.1s即席分析_ 2个维、5个指标18.3s17.2s45.8s44.8s一般报表_1个维,5个指标10.9s13s13.3s42.9s一般报表_1个维,20个指标19.5s19.9s46s50s多级浮动_ 2个维,5个指标14.1s16.5s17.1s48s多级浮动_ 2个维,20个指标22.7s24.8s61.7s76.7s图表展现_单图单表14.9s15.8s22.2s18.9s图表展现_双图双表12.5s14.5s17.6s29.6s统计报表下钻 _折叠展开24.6s46.4s42.2s67.9s统计报表下钻 _跳转17.1s15.6s21.9s33.3s图表联动12.2s10.8s17.8s42.7s测试结论1. CarbonData与Parquet在全表扫描方式下的性能根本相当,没有明显差异。参考:用例1,22. 对于含过滤条件的筛选查询,当数据规模到一定量级时,CD较之Parquet有较明显的优势,主要是CarbonData的多级索引发挥了作用,而且数据量越大,性能差异越明显,从本次的测试用例可以看出,2亿行级的数据量两者差异很小,10亿行级的数据量两者的性能差距较明显。参考:用例3,5,113. CarbonData很好地利用了索引扫描,使得查询耗时的增加幅度明显低于数据量的膨胀速度,从测试结果我们不难发现,当数据量从2亿增长到10亿时,parquet格式的查询响应时间也成倍增加。而CarbonData格式的增加幅度相对平稳得多,这得益于索引扫描替代了全表扫描。参考:用例3,5,10,114. 含过滤条件的查询,假设使用开窗函数,CarbonData的性能优势不复存在,笔者推测是因为开窗函数通常需要走全表扫描,导致CarbonData的索引无法被利用参考:用例4,65. CarbonData上的union all操作比parquet要快,原因还在分析中。参考:用例8,96. 从实测结果来看,CarbonData存储格式适合BI的应用场景. z.

    注意事项

    本文(CarbonData在BI场景中的测试.doc)为本站会员(夺命阿水)主动上传,课桌文档仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知课桌文档(点击联系客服),我们立即给予删除!

    温馨提示:如果因为网速或其他原因下载失败请重新下载,重复下载不扣分。




    备案号:宁ICP备20000045号-1

    经营许可证:宁B2-20210002

    宁公网安备 64010402000986号

    课桌文档
    收起
    展开