Other Abstract | With the advent of large astronomical facilities in recent years, the volume of astronomical data has grown explosively, and a single telescope may be equipped with multiple terminal instruments. How to deploy pipelines rapidly and process multi-band raw observation data at high speed has therefore become a hot topic in astronomical data processing. Building on the traditional data pipeline, this paper proposes a general development framework for astronomical data pipelines based on containers and microservices. Using this framework, the data processing pipelines for the 1 m New Vacuum Solar Telescope (NVST) of Yunnan Observatory and the Optical and Near-infrared Solar Eruption Tracer (ONSET) of Nanjing University were developed. The main innovative work of this paper consists of two parts: • Introducing Singularity, a high-performance container virtualization technology, into astronomical data pipeline development. The performance of Singularity was evaluated in an actual project, and its overhead was found to be nearly negligible (less than 5%). Container technology solves the problem of software dependencies in the runtime environment, allows the developed pipeline to run in any high-performance computing environment, and greatly improves the portability of the pipeline. Because the container holds the complete pipeline environment, it also makes online debugging easier. • Developing a microservice-based pipeline model that combines high performance, flexibility, and portability. Flexibility and portability: a pipeline decoupling standard is defined and the pipeline is decoupled accordingly. The message-queuing network library ZeroMQ is used to implement service discovery and registration. After decoupling, each microservice in the pipeline has a single function and clearly defined inputs and outputs. 
The pipeline topology is defined in a configuration file, and both the parsing of this file and one-click deployment of the pipeline are implemented. This model is well suited to developing multi-terminal, multi-functional astronomical data pipelines: it maximizes program reusability, reduces duplicated software development, and allows a scientific data processing pipeline to be built in the shortest possible time when new equipment is put into operation. High performance: two container resource expansion and scheduling algorithms are proposed, which can provide additional acceleration for the pipeline when resources (GPUs, CPUs) are underutilized. In addition, the Singularity-based data pipeline supports high-performance scenarios such as MPI, GPUs, and InfiniBand (IB) networks. In conclusion, based on containerization and the microservice concept, this paper proposes a general framework for astronomical data pipelines and develops the corresponding data pipelines for the telescopes of Fuxianhu Observatory. Engineering practice demonstrates that this development approach is feasible for the current development of multi-channel astronomical data pipelines. |
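The idea of a configuration-defined pipeline topology can be illustrated with a minimal sketch. The abstract does not give the actual configuration format, so the JSON layout, the service names, the `.sif` image names, and the helper functions below are all hypothetical. The sketch only shows how a declared topology could be validated and turned into a start-up order in which every microservice is launched after the services it reads from.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical topology file: every service names its container image and
# the upstream services whose output it consumes. Not the paper's format.
CONFIG = """
{
  "services": {
    "reader":      {"image": "reader.sif",      "inputs": []},
    "calibrate":   {"image": "calibrate.sif",   "inputs": ["reader"]},
    "reconstruct": {"image": "reconstruct.sif", "inputs": ["calibrate"]},
    "archive":     {"image": "archive.sif",     "inputs": ["reconstruct"]}
  }
}
"""


def load_topology(text):
    """Parse the config and check that every input refers to a known service."""
    services = json.loads(text)["services"]
    for name, spec in services.items():
        for upstream in spec["inputs"]:
            if upstream not in services:
                raise ValueError(f"{name} depends on unknown service {upstream}")
    return services


def deployment_order(services):
    """Return service names so that each one follows all of its inputs."""
    graph = {name: set(spec["inputs"]) for name, spec in services.items()}
    # static_order() raises graphlib.CycleError if the topology has a cycle.
    return list(TopologicalSorter(graph).static_order())


def launch_commands(services):
    """Build one `singularity run <image>` command per service, in order."""
    return [["singularity", "run", services[name]["image"]]
            for name in deployment_order(services)]


if __name__ == "__main__":
    for command in launch_commands(load_topology(CONFIG)):
        print(" ".join(command))
```

A one-click deployment step could pass each command list to `subprocess.run`; the sketch stops at building the commands so that it runs without Singularity installed.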