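Title: 机器学习在相接双星研究中的应用
Alternative title: Application of machine learning to the study of contact binaries
Author: Ding Xu (丁旭)
Degree type: Doctoral
Supervisor: Ji Kaifan (季凯帆)
Date: 2022-07-01
Degree-granting institution: University of Chinese Academy of Sciences
Place of degree conferral: Beijing
Training institution: Yunnan Observatories, Chinese Academy of Sciences
Degree major: Astronomical Techniques and Methods
Keywords: machine learning; contact binaries; open clusters; statistical analysis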
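Abstract: Contact binaries are a class of close binaries in which the two components share a common envelope and undergo strong mass exchange and interaction. Their formation and evolution remain a subject that merits further study. As survey telescopes (such as the space-based Kepler and TESS and the ground-based ZTF) have come into operation, up to millions of light curves have been released, including hundreds of thousands of contact-binary light curves. Identifying and classifying these massive data sets, and deriving orbital parameters from them, has clearly become a bottleneck. If a classical code such as WD or Phoebe is used to solve a light curve, a single target takes from several hours to several days, which is clearly unaffordable. The core of this thesis is therefore the application of machine learning to the processing of contact-binary light curves, covering identification, classification and parameter derivation. We not only establish a complete machine-learning-based method and workflow for processing contact-binary light curves, but also apply it to TESS and ZTF observations and produce a contact-binary parameter catalogue for each. We also develop a method for determining the member stars of open clusters and derive parameters for contact binaries among the members. The work comprises the following:
1. A method for identifying periodic variable stars based on LSSNR is proposed. The method identifies periodic variables fairly accurately and yields relatively accurate periods. A search of the light curves from sectors 1-43 released by TESS yielded 26,206 periodic variable stars in total.
2. A neural-network classification model for periodic variable stars is built. The light curves are Fourier transformed, and the amplitudes of the low-frequency terms together with the period are taken as features, which reduces the dimensionality of the data and simplifies the network. The final mean F1-score is 96%. When the method is applied to the TESS survey data, 87.9% of the targets have a classification probability above 90%.
3. A fast batch parameter-derivation method based on neural networks and MCMC is proposed and implemented. First, with hundreds of thousands of theoretical light curves generated by Phoebe as training samples, two regression networks are trained: one from light curve to parameters (the inverse model) and one from parameters to light curve (the forward model). The prediction of the inverse model is then used as the initial value for a fast batch solution that combines the forward model with MCMC. The derived contact-binary parameters are the mass ratio, orbital inclination, temperature ratio, fill-out factor and third-light ratio, each with an error range. The method is four orders of magnitude faster than before: on an ordinary Intel i7 CPU computer one target is solved in twenty seconds, which makes solving massive numbers of contact binaries feasible. We applied the method to the contact binaries released by Kepler and compared the results with previous work, which shows that the method also performs very well in terms of accuracy.
4. Based on our method, separate models are built for the characteristics of the TESS space-telescope data and the ZTF ground-based data, and contact binaries in both data sets are identified, classified and solved. Light curves are then generated with Phoebe from the derived parameters and compared with the observations to verify the solutions. This produced a TESS contact-binary parameter catalogue (699 objects) and a ZTF contact-binary parameter catalogue (86,365 objects). Preliminary statistics of the parameter distributions are also given and compared with earlier conclusions drawn from small samples.
5. A DBSCAN-based method for determining open-cluster member stars is developed, and the new parameter-derivation method is applied to contact binaries among the cluster members. Validation on data for 30 clusters shows that the method obtains more reliable cluster members, reaching G magnitudes of about 21. Contact binaries among the members of NGC 6791 were identified, yielding four EW-type light curves, and one of these contact binaries was solved.
This thesis applies a range of machine-learning methods (dimensionality reduction and clustering from unsupervised learning, classification and regression from supervised learning) to the identification, classification and parameter derivation of contact binaries and to the determination of cluster members. It proposes a fairly complete set of methods for processing massive numbers of contact-binary light curves and obtains some corresponding scientific results. The methods may also serve as a reference for processing other types of binaries.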
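The feature extraction described in item 2 of the abstract (low-frequency Fourier amplitudes plus the period) can be illustrated with a minimal sketch. This is not the thesis code: the function name `fourier_features`, the choice of four harmonics, and the synthetic light curve are all assumptions made for illustration, and the direct projection used here is only exact when the phases are evenly covered.

```python
import math


def fourier_features(times, fluxes, period, n_harmonics=4):
    """Return [period, A_1, ..., A_n] for a light curve with a known period.

    A_k is the amplitude of the k-th harmonic of the orbital frequency,
    estimated by projecting the mean-subtracted flux onto cos/sin terms.
    (Hypothetical helper; exact only for evenly covered phases.)
    """
    n = len(times)
    mean_flux = sum(fluxes) / n
    features = [period]
    for k in range(1, n_harmonics + 1):
        cos_sum = 0.0
        sin_sum = 0.0
        for t, f in zip(times, fluxes):
            angle = 2.0 * math.pi * k * (t / period)
            cos_sum += (f - mean_flux) * math.cos(angle)
            sin_sum += (f - mean_flux) * math.sin(angle)
        features.append(2.0 * math.hypot(cos_sum, sin_sum) / n)
    return features


# Synthetic, noise-free light curve (made-up numbers): three full cycles,
# evenly sampled. A contact binary shows two minima per orbit, so the
# second harmonic dominates.
period = 0.35  # days, arbitrary
n_points = 300
times = [i * 3.0 * period / n_points for i in range(n_points)]
fluxes = [
    1.0
    - 0.03 * math.cos(2.0 * math.pi * t / period)
    - 0.20 * math.cos(4.0 * math.pi * t / period)
    for t in times
]

features = fourier_features(times, fluxes, period)
print([round(x, 4) for x in features])
```

A short feature vector of this kind, rather than the full light curve, is what lets the classifier stay small. Real survey data are unevenly sampled and noisy, so a least-squares harmonic fit would usually replace the direct projection.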
Alternative abstract: Contact binaries are close binaries whose two component stars share a common envelope and undergo strong mass exchange and interaction; their formation and evolution are still a subject of intensive study. With various survey projects (the Kepler and TESS space telescopes, the ground-based ZTF), millions of light curves have been released, including hundreds of thousands of light curves of contact binaries. How to identify and classify these huge amounts of data, and how to derive parameters from them, has clearly become a bottleneck. If the classical WD or Phoebe programs are used, deriving the parameters of a single contact binary can take hours to days, which is clearly unaffordable. The core of this thesis is the application of machine learning to the processing of contact-binary light curves, including identification, classification and parameter derivation. We have not only developed a complete machine-learning-based method and workflow for processing the light curves of contact binaries, but also applied it to the TESS and ZTF observations to produce the corresponding parameter catalogues of contact binaries. We have also developed a method for determining the member stars of open clusters and derived the parameters of contact binaries among them. The work in this thesis includes the following.
1. A method for identifying periodic variable stars based on LSSNR is proposed. The method identifies periodic variable stars fairly accurately and gives relatively accurate periods. A search of the TESS light curves from sectors 1-43 yielded a total of 26,206 periodic variable stars.
2. A neural-network classification model for periodic variable stars was developed. By Fourier transforming the light curves and selecting the amplitudes of the low-frequency terms and the period of the variable star as features, the dimensionality of the data was reduced and the neural network model was simplified, giving a final mean F1-score of 96%. When this method is applied to the TESS survey data, the classification probability of 87.9% of the targets is greater than 90%.
3. A fast batch method for deriving the parameters of contact binaries, based on neural networks and MCMC, is proposed and implemented. Regression neural networks from the light curve to the corresponding parameters (inverse model) and from the parameters to the light curve (forward model) were trained on hundreds of thousands of theoretical light curves generated by Phoebe. The prediction of the inverse model is then used as the initial value, and the forward model is combined with MCMC for the fast batch solution. The final parameters obtained for a contact binary include the mass ratio, orbital inclination, temperature ratio, fill-out factor and third-light ratio, together with their error ranges. The method is four orders of magnitude more computationally efficient than before, allowing a target to be solved in twenty seconds on an ordinary Intel i7 CPU computer and thus making it possible to derive the parameters of a large number of contact binaries. We used the method to derive parameters for the contact binaries released by the Kepler telescope and compared them with previous results, which shows that the method also performs very well in terms of accuracy.
4. Based on our method, the TESS space-telescope data and the ZTF ground-based data are modelled separately according to their characteristics, and contact binaries in both data sets are identified, classified and solved. Finally, the corresponding light curves are produced with the Phoebe program and compared with the observations to verify the accuracy of the solutions. This produced parameter catalogues of TESS contact binaries (699) and ZTF contact binaries (86,365). Preliminary statistics on the parameter distributions are also presented and compared with previous findings from small samples.
5. A DBSCAN-based method for determining open-cluster members is developed, and the new parameter-derivation method is applied to contact binaries among the cluster members. The method is validated with data from 30 open clusters, and the results show that it obtains more reliable cluster members, with detections down to a G magnitude of about 21. Contact binaries among the members of the cluster NGC 6791 were identified, yielding four EW-type light curves, and the parameters of one of these contact binaries were derived.
In this thesis, a number of machine learning methods (unsupervised learning for dimensionality reduction and clustering, supervised learning for classification and regression) are applied to the identification, classification and parameter derivation of contact binaries, as well as to the determination of cluster members. A fairly complete set of methods for processing massive numbers of contact-binary light curves is proposed, and some corresponding scientific results are obtained. These methods may also be useful for the treatment of other types of binaries.
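As a rough illustration of the density-based clustering behind item 5, the following is a minimal plain-Python DBSCAN applied to a made-up table of stars. It is a sketch, not the thesis method: the abstract does not state which features are clustered, so the use of (proper motion in RA, proper motion in Dec, parallax), the values of `eps` and `min_pts`, and all coordinates below are assumptions.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Made-up "cluster": 25 stars on a tight grid in (pmra, pmdec, parallax).
cluster_stars = [
    (-0.4 + 0.02 * dx, -2.3 + 0.02 * dy, 0.24)
    for dx in range(-2, 3)
    for dy in range(-2, 3)
]
# Made-up field stars, far from the cluster and from each other.
field_stars = [(3.0, 1.0, 1.5), (-5.0, 4.0, 0.8), (1.2, -6.0, 2.1), (6.0, 6.0, 0.1)]

labels = dbscan(cluster_stars + field_stars, eps=0.05, min_pts=5)
members = [i for i, lab in enumerate(labels) if lab == 0]
print(len(members), "members;", labels.count(-1), "field stars")
```

In practice the features would need to be scaled to comparable units before a single `eps` is meaningful, and the brute-force neighbour search would be replaced by a spatial index for large catalogues.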
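Subject area: Astronomy; Stars and the Milky Way; Computer science and technology; Artificial intelligence; Computer applications
Discipline category: Science; Science::Astronomy; Engineering; Engineering::Computer Science and Technology (engineering or science degree may be conferred)
Pages: 0
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ynao.ac.cn/handle/114a53/25786
Collection: Southern Base (南方基地)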
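Recommended citation
GB/T 7714
Ding Xu. 机器学习在相接双星研究中的应用 (Application of machine learning to the study of contact binaries) [D]. Beijing. University of Chinese Academy of Sciences, 2022.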
Files in this item
File name/size | Document type | Version type | Access type | License
机器学习在相接双星研究中的应用.pdf (11844 KB) | Thesis | | Open access | CC BY-NC-SA