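Research on Massive Data Processing Methods for Mingantu Spectral Radioheliograph
Original title: 明安图射电频谱日像仪海量数据处理方法研究
Author: Mei Ying (梅盈)
Degree type: Doctoral
Advisor: Wang Feng (王锋)
Date: 2018-07-01
Degree-granting institution: University of Chinese Academy of Sciences
Place of degree conferral: Beijing
Degree discipline: Astronomical Techniques and Methods
Keywords: massive data; radio interferometric imaging; data flagging; high performance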
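Abstract (translated from the Chinese): Radio astronomy is an important field of modern astronomical research. As large radio telescopes continue to be built, radio observations open up great possibilities for achieving many scientific goals. High-performance processing and analysis of the massive, complex data produced by radio telescopes is an important part of astronomical research in the new century. The MingantU SpEctral Radioheliograph (MUSER) is China's new-generation centimeter-decimeter wave aperture synthesis telescope. It performs radio spectral imaging of the Sun over the 400 MHz to 15 GHz frequency range with high temporal, spatial, and frequency resolution simultaneously, and the resulting massive volume of observational data poses major challenges for high-performance real-time and offline data processing. This dissertation studies the automated workflow and the processing performance of the MUSER data processing pipeline. The work covers the entire flow from data preprocessing to imaging and aims at breakthroughs in key technologies such as MUSER imaging quality and computing performance. The specific work is as follows:
(1) In the data preprocessing stage, the study of machine-learning-based automatic flagging of abnormal data is taken further. By analyzing the raw observational data, automatic flagging of abnormal data in the MUSER data processing system is implemented with support vector machine and recurrent neural network methods. The approach achieves high flagging accuracy, removes the earlier over-reliance on manual records for flagging anomalies, and safeguards the imaging quality of the pipeline.
(2) A complete MUSER data processing pipeline design is implemented in preliminary form. The MUSER UVFITS format definition is given, laying the groundwork for data exchange. In addition, phase calibration, UVW calculation, and the integration of observational data are studied systematically, effectively improving the final image quality.
(3) MUSER high-performance imaging algorithms are studied further, with GPU implementations of the weighting, gridding, and CLEAN algorithms. A hybrid CLEAN algorithm suited to MUSER and a parallel CLEAN algorithm based on multi-scale band-pass filtering are studied, improving both imaging quality and algorithm performance.
(4) Correction of the current-stage system phase error and estimation of the number of CLEAN iterations are implemented. Because of problems such as the insufficient tracking accuracy of the only calibration source currently available, the solar disk in the raw dirty image is offset from the image center. The phase error parameters are estimated with a phase correlation method, and the estimated brightness of the solar disk and the sky background is used to determine the iteration threshold of the CLEAN algorithm. This improves the algorithm and removes the dependence of the iteration count on experience.
(5) MUSER high-performance data processing is implemented. GPU implementations of the algorithms and an imaging performance analysis are given, and the whole data processing pipeline is integrated into OpenCluster, a distributed computing framework designed by the project team for high-performance data processing in astronomy. A MUSER data processing command-line system is also developed in Python. The current system meets both real-time and offline processing requirements.
This work further improves and ultimately builds a high-performance MUSER data processing system, laying a foundation for fully exploiting MUSER's performance advantages and increasing subsequent scientific output. In addition, against the background of China's participation in the international scientific and technological cooperation project to build the world's largest radio telescope array, the Square Kilometre Array (SKA), MUSER can serve as an effective experimental environment for SKA solar observation. The massive-data processing methods studied here also lay a good foundation for future SKA-related work.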
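The abstract names phase correlation as the way the offset of the solar disk is estimated but gives no details. The following is a minimal sketch of generic image-domain phase correlation, not the dissertation's implementation. The function names `phase_correlation_shift` and `make_disk`, the synthetic uniform disk, and the restriction to integer-pixel circular shifts are all assumptions made for illustration.

```python
import numpy as np

def phase_correlation_shift(reference, image):
    """Estimate the integer (dy, dx) shift that maps `reference` onto `image`."""
    f_ref = np.fft.fft2(reference)
    f_img = np.fft.fft2(image)
    cross_power = f_img * np.conj(f_ref)
    # Keep only the phase; the small floor avoids division by zero.
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peak indices past the midpoint correspond to negative shifts (FFT wrap-around).
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, correlation.shape)]
    return int(shift[0]), int(shift[1])

def make_disk(size, center, radius):
    """Hypothetical stand-in for a solar disk: a uniform filled circle."""
    y, x = np.ogrid[:size, :size]
    return ((y - center[0]) ** 2 + (x - center[1]) ** 2 <= radius ** 2).astype(float)

# Synthetic test: a centered disk and a copy displaced by (5, -9) pixels.
size = 128
reference = make_disk(size, (64, 64), 20)
shifted = np.roll(reference, shift=(5, -9), axis=(0, 1))

dy, dx = phase_correlation_shift(reference, shifted)
recentered = np.roll(shifted, shift=(-dy, -dx), axis=(0, 1))
```

By the Fourier shift theorem, a position offset in the image corresponds to a linear phase gradient in the Fourier (visibility) domain, so an estimated `(dy, dx)` could in principle be converted into a phase correction. How the dissertation applies the correction is not stated in this record.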
Abstract (author's English version): Radio astronomy forms an important part of modern astronomy. With the construction of large radio telescopes, radio observation offers great possibilities for realizing many scientific goals. High-performance processing and analysis of the massive, complex data from radio telescopes is an essential part of astronomy research in the new century. The MingantU SpEctral Radioheliograph (MUSER) is a new-generation centimeter-decimeter wave aperture synthesis telescope in China. MUSER is a dedicated solar radio interferometer that images the Sun in the frequency range of 400 MHz to 15 GHz with high temporal, spatial, and frequency resolution almost simultaneously. The massive volume of observational data poses a great challenge to high-performance real-time and offline data processing. This dissertation focuses on the automation and the data processing performance of the MUSER pipeline. The work runs through the whole process from data preprocessing to imaging and aims at breakthroughs in MUSER imaging quality and computing performance. The work is described in detail as follows:
(1) In the data preprocessing stage, further research is carried out on automatic flagging of abnormal data based on machine learning. Through analysis of the original observation data, this study implements automatic flagging of abnormal data in the MUSER data processing system using support vector machine (SVM) and recurrent neural network (RNN) methods. The approach achieves high flagging accuracy, which solves the over-reliance of the flagging process on manual records in previous work and safeguards imaging quality.
(2) A complete MUSER data processing pipeline is designed in preliminary form. The format definition of MUSER UVFITS is given, which lays a foundation for data exchange. In addition, phase calibration, UVW calculation, and observation data integration methods are systematically studied to improve the final image quality.
(3) MUSER high-performance imaging algorithms are studied further, and algorithms including weighting, gridding, and CLEAN are implemented on the GPU. A hybrid CLEAN algorithm for MUSER and a parallel CLEAN algorithm based on multi-scale band-pass filtering are studied, improving both imaging quality and performance.
(4) The system phase error at the current stage is corrected and the number of iterations for the CLEAN algorithm is estimated. Owing to the insufficient tracking accuracy of the only calibration source available at the current stage, the solar disk in the original dirty map deviates from the image center. By analyzing the dirty map, the dissertation detects the solar disk and the sky brightness, and then calculates the deviation parameter to correct the phase error. Using the estimated sky brightness as the iteration threshold improves the CLEAN algorithm and removes the dependence of the iteration count on experience.
(5) The MUSER high-performance data processing pipeline is implemented. The dissertation gives the GPU implementation and a performance analysis of the algorithms. Furthermore, the whole data processing pipeline is integrated into the distributed computing framework OpenCluster, which is designed for high-performance data processing in astronomy. A MUSER data processing command-line system is also developed in Python. The currently implemented system meets real-time and offline processing requirements.
The work of this dissertation further improves and finally constructs a high-performance MUSER data processing system, which lays a foundation for fully exploiting MUSER's performance advantages and for improving subsequent research output. In addition, in the context of China's participation in the international science and technology cooperation project to build the world's largest radio telescope array, the Square Kilometre Array (SKA), MUSER can be used as an effective experimental environment for SKA solar observation. Moreover, the methods studied here for massive data processing lay a good foundation for future SKA-related work.
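Item (4) of the abstract describes stopping CLEAN at a brightness threshold instead of an experience-based iteration count. A minimal sketch of that idea with a textbook Högbom CLEAN follows. It is a CPU toy in NumPy, not the GPU or hybrid algorithm of the dissertation. The function name `hogbom_clean`, the Gaussian point spread function, the loop gain, and the threshold value 0.01 (standing in for an estimated sky background brightness) are assumptions.

```python
import numpy as np

def hogbom_clean(dirty, psf, threshold, gain=0.1, max_iter=10000):
    """Hogbom CLEAN that stops once the residual peak falls below `threshold`.

    `psf` must have shape (2*ny - 1, 2*nx - 1) with its peak (value 1) at the
    center, so it can be shifted onto any pixel of the (ny, nx) dirty image.
    """
    ny, nx = dirty.shape
    residual = np.array(dirty, dtype=float)
    model = np.zeros_like(residual)
    n_iter = 0
    while n_iter < max_iter:
        py, px = np.unravel_index(np.argmax(np.abs(residual)), residual.shape)
        peak = residual[py, px]
        if abs(peak) < threshold:
            break  # residual is at the assumed background level: stop
        model[py, px] += gain * peak
        # Subtract a scaled PSF centered on the peak pixel.
        residual -= gain * peak * psf[ny - 1 - py:2 * ny - 1 - py,
                                      nx - 1 - px:2 * nx - 1 - px]
        n_iter += 1
    return model, residual, n_iter

# Toy data: two point sources observed through a Gaussian PSF (assumed shape).
n = 32
yy, xx = np.mgrid[-(n - 1):n, -(n - 1):n]
psf = np.exp(-(yy ** 2 + xx ** 2) / (2 * 1.5 ** 2))
sources = {(10, 12): 1.0, (20, 25): 0.5}
dirty = np.zeros((n, n))
for (sy, sx), flux in sources.items():
    dirty += flux * psf[n - 1 - sy:2 * n - 1 - sy, n - 1 - sx:2 * n - 1 - sx]

threshold = 0.01  # stand-in for an estimated sky background brightness
model, residual, n_iter = hogbom_clean(dirty, psf, threshold)
```

With a threshold tied to the background level, the loop ends when only background-level residuals remain, so `max_iter` acts as a safety cap and no longer needs tuning.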
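Subject areas: Astronomy; Radio astronomy; Radio astronomy methods; Computer science and technology; Computer applications
Discipline categories: Science; Science::Astronomy; Engineering; Engineering::Computer Science and Technology (engineering or science degrees may be conferred)
Pages: 120
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ynao.ac.cn/handle/114a53/25415
Collection: Other
Author affiliation: Yunnan Observatories, Chinese Academy of Sciences
First author affiliation: Yunnan Observatories, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Mei Ying. 明安图射电频谱日像仪海量数据处理方法研究 (Research on Massive Data Processing Methods for Mingantu Spectral Radioheliograph) [D]. Beijing: University of Chinese Academy of Sciences, 2018.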
Files in this item
File name: 明安图射电频谱日像仪海量数据处理方法研究; Size: 28969 KB; Document type: Thesis; Access: Open access; License: CC BY-NC-SA
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.