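Research on Technologies of Massive Data Parallel Processing for Solar Telescope
Original title: 太阳望远镜海量数据并行处理技术研究
Author: 李雪宝 (Li Xuebao)
Degree type: Doctoral
Supervisor: 王锋 (Wang Feng)
Date: 2015-07-01
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Beijing
Major: Astronomical Technology and Methods
Keywords: solar telescope; massive astronomical data; high-performance computing; high-resolution reconstruction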
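Abstract: The 1-meter New Vacuum Solar Telescope (NVST) is one of China's major ground-based large-aperture solar telescopes. Its main goal is to observe the solar photosphere and chromosphere at high resolution with a multi-channel high-resolution imaging system. The channels currently in use are Halpha for the chromosphere and TiO and G-band for the photosphere. The three channels simultaneously acquire images of 1024*1024 or 2560*2160 pixels at 10 frames per second, producing at least 7 TB of data per observing day. Although the price of storage hardware keeps falling, a data volume this large remains difficult to transfer and store. To improve the spatial resolution of the solar images, the observational data must undergo high-resolution reconstruction. NVST uses two kinds of high-resolution reconstruction: frame selection with shift-and-add (Level1) and speckle masking (Level1+). Both statistically reconstruct one image from 100-200 raw frames. Reconstructing one day of single-channel data serially on a single high-end server takes about one day for Level1 and at least about three months for Level1+. This is far too slow to keep up with the observations, and it severely limits the telescope's operating efficiency and scientific output. Because NVST high-resolution reconstruction is both data-intensive and compute-intensive, parallel acceleration is urgently needed. Real-time or near-real-time processing improves the telescope's operating efficiency in two ways: (1) the total volume of observational data is quickly reduced by a factor of at least 100; (2) the delay between observation and data analysis is shortened. Rapid advances in computing technology, such as distributed parallel computing, make real-time reconstruction of massive solar image data possible. This thesis uses distributed parallel computing to accelerate NVST high-resolution solar image reconstruction. The specific work is as follows.
(1) Parallel algorithm design methods and the high-resolution reconstruction algorithms are studied. Several currently popular parallel computers and parallel programming models are surveyed, and those suited to NVST high-resolution image reconstruction are identified.
(2) MPI-based parallel algorithm for frame selection with shift-and-add (Level1) reconstruction. Given the huge NVST data volume and the slow Level1 reconstruction, this thesis uses a high-performance cluster and MPI to accelerate Level1 reconstruction. An MPI-based parallel Level1 algorithm is systematically designed and implemented, and a parallel Level1 reconstruction pipeline is built that meets the urgent need for real-time Level1 reconstruction. For the parallel Level1 reconstruction of a single 1024*1024-pixel Halpha frame, the whole process is 23 times faster than the previous IDL implementation. The speedups of the individual processing modules are analyzed, and techniques and methods for further optimizing the parallel algorithm are proposed. Scalability tests show that the algorithm scales well.
(3) MPI-based parallel algorithm for speckle masking (Level1+) reconstruction. Given the huge NVST data volume and the extremely slow Level1+ reconstruction, this thesis builds on the results and key techniques of work item 2 to systematically design and implement an MPI-based parallel Level1+ algorithm, and builds a parallel Level1+ reconstruction pipeline that meets the urgent need for real-time Level1+ reconstruction. Its real-time reconstruction performance is currently at a leading level among international peers. For the parallel Level1+ reconstruction of a single 2560*2160-pixel TiO frame, the whole process is about 122 times faster than the previous IDL implementation. The speedups of the individual processing modules are analyzed, and techniques and methods for further optimizing the parallel algorithm are proposed. Scalability tests show that the algorithm scales well.
(4) OpenMP-based parallel algorithm for subimage reconstruction. The MPI-based parallel Level1+ reconstruction of work item 3 already achieves real-time or near-real-time performance, but when hardware computing resources are limited, the reconstruction of all subimages accounts for a considerable share of the total time. This thesis uses OpenMP multithreading to accelerate subimage reconstruction, and systematically designs and implements an OpenMP-based parallel subimage reconstruction algorithm. For the reconstruction of a single 256*256-pixel TiO subimage, the OpenMP version is about 2.5 times faster than the single-threaded CPU implementation. The speedup grows with the data size, so the algorithm is more effective for large-scale data processing. Scalability tests show that the algorithm scales well.
(5) MPI+OpenMP-based parallel algorithm for speckle masking (Level1+) reconstruction. Combining the results of work items 3 and 4, the OpenMP-based subimage reconstruction is ported into the MPI-based parallel Level1+ algorithm. Using the hybrid MPI+OpenMP programming model, an MPI+OpenMP-based parallel Level1+ algorithm is systematically designed and implemented to further improve the speedup. The Level1+ reconstruction times of the MPI+OpenMP and pure-MPI programming models are compared. The results show that, given sufficient hardware computing resources, the MPI+OpenMP-based algorithm is faster than the pure-MPI one.
(6) CUDA-based parallel algorithm for subimage reconstruction. This thesis uses GPGPU, a newer parallel technology, to accelerate subimage reconstruction. A new CUDA-based parallel subimage reconstruction algorithm is proposed, systematically designed, and implemented. For the reconstruction of a single 256*256-pixel TiO subimage, the GPGPU version is about 6 times faster than the single-threaded CPU implementation. The speedup grows with the data size, so the algorithm is more effective for large-scale data processing.
This thesis systematically designs and implements several parallel algorithms for NVST high-resolution reconstruction and significantly increases the processing speed. On that basis, several parallel high-resolution reconstruction pipelines are built, meeting the urgent need for real-time processing of NVST's massive data. The real-time reconstruction performance of NVST exceeds the near-real-time reconstruction performance of NST (New Solar Telescope), a current internationally advanced solar telescope. In addition, the thesis optimizes the NVST parallel reconstruction algorithms, further increasing the processing speed. The results not only greatly improve the operating efficiency and scientific output of NVST, but also offer a useful technical reference for the data processing systems of next-generation solar telescopes such as CGST.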
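The data-rate figures quoted in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the 2 bytes per pixel, the choice of the larger frame size for all three channels, and the function names are assumptions that do not come from the thesis record.

```python
TB = 10**12  # decimal terabyte, assumed for this estimate


def channel_rate(width, height, fps, bytes_per_pixel=2):
    """Raw data rate of one detector in bytes per second.

    bytes_per_pixel=2 (16-bit pixels) is an assumption, not a value
    given in the abstract.
    """
    return width * height * bytes_per_pixel * fps


def hours_to_reach(total_bytes, rate_bytes_per_s):
    """Observing time in hours needed to accumulate total_bytes."""
    return total_bytes / rate_bytes_per_s / 3600.0


def reduction_factor(frames_per_output_image):
    """Data reduction when N raw frames become one image of the same size."""
    return frames_per_output_image


# Frame sizes and frame rate are taken from the abstract.
rate_large = channel_rate(2560, 2160, fps=10)
rate_small = channel_rate(1024, 1024, fps=10)

# Assumed worst case: all three channels use the larger frame size.
three_channel_rate = 3 * rate_large

print("large-frame channel, bytes/s:", rate_large)
print("small-frame channel, bytes/s:", rate_small)
print("hours to reach 7 TB:", round(hours_to_reach(7 * TB, three_channel_rate), 2))
print("reduction for 100-frame bursts:", reduction_factor(100))
```

Under these assumptions a few hours of observing are enough to reach the 7 TB mark, and reconstructing one image from each burst of 100 or more frames gives the hundredfold reduction in volume that the abstract cites.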
Other abstract: The New Vacuum Solar Telescope (NVST), a 1-meter large-aperture ground-based solar telescope, was built at the Full-shine Solar Observatory (FSO) of the Yunnan Observatories in China in 2010. Its main scientific goal is to observe fine structures in both the photosphere and the chromosphere. A multi-channel high-resolution imaging system, with two photosphere channels and one chromosphere channel, has been installed and put into use at NVST. The chromosphere is observed in Halpha (6563 Å), and the photosphere in TiO (7058 Å) and G-band (4300 Å). Because the channels are separate, the imaging system acquires data simultaneously with three detectors, each generating images of 2560*2160 or 1024*1024 pixels at about 10 frames per second. Over several hours of observation per day, the imaging system produces at least 7 TB of raw observational data. Although storage costs keep falling, such a large data volume remains difficult to transfer and distribute. To improve the spatial resolution of the solar images, high-resolution reconstruction is indispensable. The raw NVST data are reduced by lucky imaging (Level1) and speckle masking (Level1+) reconstruction. Both techniques statistically reconstruct one image from at least 100 short-exposure raw images. With an Interactive Data Language (IDL) implementation on a high-end computer, Level1 reconstruction of one day of single-channel data took about one day, and Level1+ reconstruction took at least about three months. This was too slow to meet the urgent need for real-time high-resolution imaging observations. Handling both the massive speckle data and the heavy computation in near real time requires parallelizing and accelerating the reconstruction.
(Near) real-time data processing improves the efficiency of the telescope in two ways: the data volume is quickly reduced by a factor of 100, and the time between observation and data analysis is dramatically shortened. The rapid development of computer technology, in particular high-performance computing, makes it possible to reconstruct massive speckle data in real time on site. The thesis studies the key technologies of real-time image reconstruction through distributed and parallel computing. The main contents of the thesis are as follows.
(1) Study the high-resolution reconstruction algorithms and the design methods of parallel computing, survey several currently popular parallel computers and parallel programming models, and identify those suitable for NVST.
(2) Propose a parallel algorithm for Level1 reconstruction based on MPI (Message Passing Interface). Given the massive speckle data and the large amount of computation in Level1 reconstruction, we use a high-performance cluster and MPI to accelerate it. We systematically design and implement an MPI-based parallel algorithm for Level1 reconstruction and build a parallel processing pipeline that meets the urgent demand for real-time reconstruction. For the reconstruction of one 1024*1024-pixel Halpha image, the whole process is about 23 times faster than the IDL implementation. We analyze the speedups of the individual modules and propose techniques and methods for further optimizing the parallel algorithm. We also test the scalability of the parallel algorithm, and the results show that it scales well.
(3) Propose a parallel algorithm for Level1+ reconstruction based on MPI. Given the massive speckle data and the huge amount of computation in Level1+ reconstruction, we build on the results and key techniques of research work 2 to systematically design and implement an MPI-based parallel algorithm for Level1+ reconstruction, and build a parallel processing pipeline that meets the urgent demand for real-time reconstruction. Its real-time performance is currently among the best in the world. For the reconstruction of one 2560*2160-pixel TiO image, the whole process is about 122 times faster than the IDL implementation. We analyze the speedups of the individual modules and propose techniques and methods for further optimizing the parallel algorithm. We also test the scalability of the parallel algorithm, and the results show that it scales well.
(4) Propose a parallel algorithm for subimage reconstruction based on OpenMP. Although MPI-based Level1+ reconstruction already achieves real-time performance, a considerable proportion of the computing time is spent reconstructing all the solar subimages when computing resources are limited. We use OpenMP to accelerate subimage reconstruction, and systematically design and implement an OpenMP-based parallel algorithm for it. Compared with the single-threaded CPU implementation, the OpenMP-based implementation achieves a speedup of about 2.5 for the reconstruction of one 256*256-pixel subimage. The speedup increases with the data size, so the algorithm becomes more efficient for large-scale data processing. We also test the scalability of the parallel algorithm, and the results show that it scales well.
(5) Propose a parallel algorithm for Level1+ reconstruction based on MPI+OpenMP. Combining the results of research work 3 and research work 4, we systematically design and implement a parallel algorithm based on MPI+OpenMP to further improve the speedup of Level1+ reconstruction. The time consumption of the MPI+OpenMP-based implementation is compared with that of the pure-MPI implementation. The results show that, given sufficient hardware computing resources, the MPI+OpenMP-based implementation is faster than the pure-MPI implementation.
(6) Propose a parallel algorithm for subimage reconstruction based on Compute Unified Device Architecture (CUDA). We systematically design and implement a new parallel method for speckle masking reconstruction of solar subimages on General Purpose Graphics Processing Units (GPGPU). Compared with the single-threaded CPU implementation, the CUDA-based implementation achieves a speedup of about 6 for the reconstruction of one 256*256-pixel subimage. The speedup increases with the data size, so the algorithm becomes more efficient for large-scale data processing.
The thesis systematically designs and implements multiple parallel algorithms for NVST high-resolution reconstruction and significantly improves the processing speed. On this basis, we build multiple parallel processing pipelines that meet the urgent demand for real-time reconstruction of NVST's massive data. The real-time reconstruction performance of NVST exceeds the present near-real-time performance of NST (New Solar Telescope). In addition, we optimize the parallel algorithms for NVST high-resolution reconstruction and further improve the processing speed. The research results not only improve the efficiency and scientific output of the telescope, but also provide a useful technical reference for the observation data handling systems of next-generation solar telescopes (e.g., CGST).
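To make the Level1 idea concrete (select the best frames of a burst, shift them into register, and add them), here is a minimal serial sketch in pure Python. It is not the thesis's implementation. The variance-based sharpness metric, the brute-force integer alignment, the circular shifts, and all function names are assumptions chosen to keep the example short.

```python
def contrast(frame):
    """Pixel variance, used here as a simple (assumed) sharpness metric."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def roll(frame, dy, dx):
    """Circularly shift a frame down by dy rows and right by dx columns."""
    h, w = len(frame), len(frame[0])
    return [[frame[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def best_shift(reference, frame, max_shift=2):
    """Brute-force the integer shift that maximizes correlation with reference."""
    def score(shifted):
        return sum(a * b
                   for ref_row, row in zip(reference, shifted)
                   for a, b in zip(ref_row, row))

    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return max(candidates, key=lambda s: score(roll(frame, s[0], s[1])))


def shift_and_add(frames, keep_fraction=0.5, max_shift=2):
    """Keep the sharpest frames, align them to the sharpest one, and average."""
    ranked = sorted(frames, key=contrast, reverse=True)
    kept = ranked[:max(1, int(len(ranked) * keep_fraction))]
    reference = kept[0]
    aligned = []
    for frame in kept:
        dy, dx = best_shift(reference, frame, max_shift)
        aligned.append(roll(frame, dy, dx))
    h, w = len(reference), len(reference[0])
    return [[sum(f[y][x] for f in aligned) / len(aligned) for x in range(w)]
            for y in range(h)]


# Tiny synthetic burst: one bright pixel, displaced differently in each frame.
base = [[0] * 6 for _ in range(6)]
base[2][3] = 36
burst = [base, roll(base, 1, -2), roll(base, -2, 1)]
result = shift_and_add(burst, keep_fraction=1.0)
```

In this sketch each burst is processed independently of every other burst, which is the kind of property a distributed implementation can exploit. How the thesis actually partitions the work across MPI processes is not stated in the abstract.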
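Subject area: Astronomy
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ynao.ac.cn/handle/114a53/9454
Collection: Fuxian Lake Solar Observation and Research Base (抚仙湖太阳观测和研究基地)
Author affiliation: Yunnan Observatories, Chinese Academy of Sciences (中国科学院云南天文台)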
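Recommended citation (GB/T 7714):
Li Xuebao (李雪宝). Research on Technologies of Massive Data Parallel Processing for Solar Telescope (太阳望远镜海量数据并行处理技术研究) [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2015.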
Files in this item
File name/size: 太阳望远镜海量数据并行处理技术研究.pd (4372KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA
File name: 太阳望远镜海量数据并行处理技术研究.pdf
Format: Adobe PDF