Huapu System (华谱系统)

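Shared Results
  • Monographs
  • Chinese journal papers
  • International journal papers
  • International conference papers
  • Invention patents
  • Genealogy dataset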

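1. Xindong Wu, Xiangfeng Wang, Bo Jin, Zheng Yu, Minghui Wu. Human-Machine Collaboration (人机协同). Beijing: Science Press, 2022, 1-170.

2. Xindong Wu, Ting Bai, Jie Zhang, Bin Wu, Minghui Wu. Knowledge Graph (知识图谱). Beijing: Science Press, 2022, 1-322.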

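1. Xindong Wu, Jiao Li, Peng Zhou, Chenyang Bu. Fusion Techniques for Fragmented Genealogy Data (碎片化家谱数据的融合技术). Journal of Software (软件学报), 32(9): 2816-2836, 2021.

2. Xindong Wu, Shaojing Sheng, Tingting Jiang, Chenyang Bu, Minghui Wu. From Knowledge Graph to Data Middle Platform: The Huapu System (从知识图谱到数据中台: 华谱系统). Acta Automatica Sinica (自动化学报), 46(10): 2045-2059, 2020.

3. Xindong Wu, Bingbing Dong, Xinzheng Du, Wei Yang. Data Governance Technology (数据治理技术). Journal of Software (软件学报), 30(9): 2830-2856, 2019.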

1. Xindong Wu, Tingting Jiang, Yi Zhu and Chenyang Bu. Knowledge Graph for China's Genealogy. IEEE Transactions on Knowledge and Data Engineering. 35(1): 634-646, 2023.

2. Guliu Liu, Lei Li, Guanfeng Liu, Xindong Wu. Social Group Query Based on Multi-Fuzzy-Constrained Strong Simulation. ACM Transactions on Knowledge Discovery from Data. 16(3), 54: 1–27, 2022.

3. Tingting Jiang, Chenyang Bu, Yi Zhu and Xindong Wu. Combining Embedding-Based and Symbol-Based Methods for Entity Alignment. Pattern Recognition, 124, 2022.

4. Shengwei Ji, Chenyang Bu, Lei Li, and Xindong Wu. Local Graph Edge Partitioning. ACM Transactions on Intelligent Systems and Technology. 12(5), 61: 1-25, 2021.

5. Bingbing Dong, Yi Zhu, Lei Li, Xindong Wu. Hybrid Collaborative Recommendation of Co-Embedded Item Attributes and Graph Features. Neurocomputing, 442, 307-316, 2021.

6. Jiao Li, Chenyang Bu, Peipei Li, Xindong Wu. A Coarse-to-Fine Collective Entity Linking Method for Heterogeneous Information Networks, Knowledge-Based Systems, 228, 2021.

7. Jipeng Qiang, Xindong Wu. Unsupervised Statistical Text Simplification. IEEE Transactions on Knowledge and Data Engineering, 33(4): 1802-1806, 2021.

8. Jipeng Qiang, Ping Chen, Wei Ding, Fei Xie and Xindong Wu. Heterogeneous-Length Text Topic Modeling for Reader-Aware Multi-Document Summarization. ACM Transactions on Knowledge Discovery from Data. 13(4), 42: 1-21, 2019.

9. Peipei Li, Haixun Wang, Hongsong Li, and Xindong Wu. Employing Semantic Context for Sparse Information Extraction Assessment. ACM Transactions on Knowledge Discovery from Data, 12(5), 54: 1-36, 2018.

10. Xindong Wu, Hao Chen, Chenyang Bu, Shengwei Ji, Zan Zhang, Victor S. Sheng. HUSS: A Heuristic Method for Understanding the Semantic Structure of Spreadsheets. Data Intelligence, 5(3): 537-559, 2023.

1. Bingbing Dong, Zan Zhang, Jiao Li, Yi Zhu, Chenyang Bu and Xindong Wu. Hypernode: Entity Fusion for Data Traceability and Link Prediction. Proceedings of 2022 IEEE International Conference on Data Mining (ICDM), 111-120, 2022.

2. Ru Chen, Guliu Liu, Yi Zhu, Xindong Wu. A Scheme for Kinship Reasoning based on Ontology. IEEE International Conference on Big Knowledge, 102-109, 2021.

3. Jianxuan Shao, Chenyang Bu, Shengwei Ji, Xindong Wu. A Weak Supervision Approach with Adversarial Training for Named Entity Recognition. Pacific Rim International Conference on Artificial Intelligence. 17-30, 2021.

4. Xindong Wu, Tingting Jiang, Yi Zhu, Chenyang Bu. Knowledge Graph for China's Genealogy. Proceedings of 11th IEEE International Conference on Knowledge Graph, 529-535, 2020. ICKG-2020 Best Paper Award.

5. Xindong Wu, Shaojing Sheng, Peng Zhou. Balanced Tree Partitioning with Succinct Logic. Proceedings of 11th IEEE International Conference on Knowledge Graph, 552-559, 2020.

6. Guliu Liu and Lei Li. Knowledge Fragment Cleaning in a Genealogy Knowledge Graph. Proceedings of 11th IEEE International Conference on Knowledge Graph, 521-528, 2020.

7. Jipeng Qiang, Yun Li, Yi Zhu, Yunhao Yuan, Xindong Wu. Lexical Simplification with Pretrained Encoders. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05): 8649-8656, 2020.

8. Shengwei Ji, Chenyang Bu, Lei Li, Xindong Wu. Local Graph Edge Partitioning with a Two-Stage Heuristic Method. In: Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 228-237, 2019.

9. Xindong Wu, Jia Wu, Xiaoyi Fu, Jiachen Li, Peng Zhou, Xu Jiang. Automatic Knowledge Graph Construction: A Report on the 2019 ICDM/ICBK Contest. Proceedings of IEEE International Conference on Data Mining, 1540-1545, 2019.

10. Shaojing Sheng, P. Zhou, X. Wu. CEPV: A Tree Structure Information Extraction and Visualization Tool for Big Knowledge Graph. In: Proceedings of 9th International Conference on Big Knowledge, 221-228, 2018.

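1. Zan Zhang, Zhe Zhang, Shaojing Sheng, Xindong Wu. "Method and Apparatus for Displaying a Genealogy Tree Based on Coordinate Information, and Electronic Device". Invention patent, Patent No. ZL 2022 1 0826051.4.

2. Xindong Wu, Shaojing Sheng, Guliu Liu, Zan Zhang. "Noise Detection Method and Apparatus for a Genealogy Knowledge Graph, and Electronic Device". Invention patent, Patent No. ZL 2022 1 0082551.1.

3. Xindong Wu, Yan Hong, Chenyang Bu. "High-Quality Noise Detection Method and Apparatus Based on Rule Information". Invention patent, Patent No. ZL 2022 1 0135548.1.

4. Xindong Wu, Shaojing Sheng, Chenyang Bu. "Genealogy Volume Partitioning Method and Apparatus, and Electronic Device". Invention patent, Patent No. ZL 2021 1 1095570.X.

5. Xindong Wu, Hao Chen, Chenyang Bu. "Information Extraction Method and Apparatus for Genealogy Registration Forms, and Electronic Device". Invention patent, Patent No. ZL 2021 1 0888402.X.

6. Xindong Wu, Shaojing Sheng, Peng Zhou, Chenyang Bu. "Text Processing Method and Apparatus, Non-Volatile Storage Medium, and Processor". Invention patent, Patent No. ZL 2021 1 0456229.6.

7. Xindong Wu, Haixia Zhao, Lei Li, Chenyang Bu. "Method and Apparatus for Converting Characters". Invention patent, Patent No. ZL 2021 1 0378904.8.

8. Xindong Wu, Shaojing Sheng, Peng Zhou, Chenyang Bu. "Method and Apparatus for Processing Genealogy Data". Invention patent, Patent No. ZL 2021 1 0251319.1.

9. Xindong Wu, Guliu Liu, Lei Li. "Method and Apparatus for Identifying and Processing Persons with the Same Name". Invention patent, Patent No. ZL 2020 1 0167476.X.

10. Xindong Wu, Tingting Jiang, Chenyang Bu. "Permission Control Method and Apparatus". Invention patent, Patent No. ZL 2020 1 1027179.1.

11. Xindong Wu, Shaojing Sheng, Chenyang Bu, Peng Zhou. "Genealogy Printing Method and Apparatus". Invention patent, Patent No. ZL 2019 1 1167599.7.

12. Xindong Wu, Lingfeng Zhong, Yi Zhu. "Genealogy Recognition Method and Apparatus, Storage Medium, and Processor". Invention patent, Patent No. ZL 2019 1 1067405.6.

13. Xindong Wu, Jiao Li, Peng Zhou. "Method and Apparatus for Processing Genealogy Data, and Processor". Invention patent, Patent No. ZL 2019 1 0640336.7.

14. Xindong Wu, Bingbing Dong, Yi Zhu. "Data Integration Method and Apparatus". Invention patent, Patent No. ZL 2019 1 0528294.8.

15. Xindong Wu, Tingting Jiang, Chenyang Bu, Lei Li, Xiaojian Liu. "A Fusion Method for Attribute Names of Genealogy Persons". Invention patent, Patent No. ZL 2018 1 0990234.3.

16. Lei Li, Guliu Liu, Gongqing Wu, Xindong Wu. "A Crowdsourced Construction Method for Blockchain-Based Smart Contracts". Invention patent, Patent No. ZL 2017 1 0104393.4.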

Genealogy dataset: Genealogy-MBW

1.  Dataset information


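To share data with researchers working on genealogy culture and graph data, the Huapu system provides the Genealogy-MBW dataset. The release includes Chinese and English descriptions of the data, a usage document, and the dataset files (nodes and edges). The download links are given below.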

2.  Document download


3.  Genealogy-MBW dataset description (translated from the Chinese version)


3.1 Introduction

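Genealogy-MBW comes from the Huapu system (https://www.zhonghuapu.com/) and is a real genealogy dataset. The genealogy starts from an ancestor surnamed Wu (numbered 0 in the dataset), who was born in Wuyuan County, Huizhou Prefecture, in the seventh year of the Dade era of the Yuan dynasty (1303) and moved during the Ming dynasty to Tongcheng County, Anqing Prefecture (now Zongyang County). It records all of his descendants up to the most recent revision of the genealogy, completed in December 2020: 23,646 direct descendants who have a recorded name and bear the Wu surname. To make sharing with genealogy and graph-data researchers easier, the Huapu system provides this genealogy with only persons who have their own individual lineage entry (单独立世人物); private information has been anonymized.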

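In the Huapu system's genealogy graph database, nodes represent persons in a genealogy, edges represent relationships between persons, and node attributes carry the descriptive information about each person. As shown in Figure 1, 0, 1, and 7323 are three nodes: 1 is a son of 0, and 7323 is a male descendant of 0 separated by 16 generations. As of July 2022, the Huapu system stored more than 18.55 million person nodes and 1,130 genealogies.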

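3.2 Details

Table 1 gives the details of the Genealogy-MBW dataset. The dataset consists of two files: Genealogy-MBW-nodes.txt provides the full descriptive information for the nodes, and Genealogy-MBW-edges.txt provides the information for the edges. See Figures 2 and 3 for a detailed description.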

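Table 2 introduces the node attributes, their values, and their characteristics, and Figure 2 describes the nodes in the Genealogy-MBW-nodes.txt file. An empty attribute value means that the attribute does not apply. For example, "0,male,2,,1,," denotes the male person with id 0, whose family rank is 2 and whose generation is 1; the adoptive family rank and the cross-generation link value do not apply to this person. "8477,male,3,1,18,," denotes the male person with id 8477, whose family rank is 3, adoptive family rank is 1, and generation is 18; the cross-generation link value does not apply to this person.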
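The two sample records above can be read with a small parser. The following is a minimal sketch, not the official loader: the Python field names are made up for illustration, the column order is inferred only from the two examples (Table 2 is not reproduced here), and the position of the cross-generation link value as the sixth field is an assumption. Any further trailing fields are kept verbatim.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def _to_int(value: str) -> Optional[int]:
    """Map an empty field ("attribute does not apply") to None, else parse an int."""
    value = value.strip()
    return int(value) if value else None


@dataclass
class NodeRecord:
    # Field names are hypothetical; the order is inferred from the two examples.
    node_id: int
    gender: str
    family_rank: Optional[int]
    adoptive_family_rank: Optional[int]
    generation: Optional[int]
    cross_generation_value: Optional[int]  # assumed to be the sixth column
    extra: List[str] = field(default_factory=list)  # trailing fields, unparsed


def parse_node_line(line: str) -> NodeRecord:
    """Parse one comma-separated line of Genealogy-MBW-nodes.txt."""
    parts = line.rstrip("\n").split(",")
    if len(parts) < 6:
        raise ValueError(f"expected at least 6 fields, got {len(parts)}: {line!r}")
    return NodeRecord(
        node_id=int(parts[0]),
        gender=parts[1].strip(),
        family_rank=_to_int(parts[2]),
        adoptive_family_rank=_to_int(parts[3]),
        generation=_to_int(parts[4]),
        cross_generation_value=_to_int(parts[5]),
        extra=parts[6:],
    )


if __name__ == "__main__":
    for sample in ("0,male,2,,1,,", "8477,male,3,1,18,,"):
        print(parse_node_line(sample))
```

Representing "does not apply" as None rather than 0 keeps it distinct from a real numeric value, which matters for attributes such as the adoptive family rank.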

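Figure 3 describes the Genealogy-MBW-edges.txt file; the Chinese-English mapping of the relationship types is given in Table 3. The cross-generation descendant in that table is a special relationship: it indicates that the genealogy records only a person's generation and nearest known direct ancestor, while none of the intermediate persons between that ancestor and the person can be verified. To preserve as much of the genealogy as possible, the Huapu system introduces this new relationship to connect the person to that direct ancestor, and stores the number of generations between them in the node's cross-generation link value attribute.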
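To make the cross-generation relationship concrete, here is a minimal in-memory sketch using the three nodes mentioned for Figure 1. It does not reflect the layout of Genealogy-MBW-edges.txt: the relation labels, the ancestor-to-descendant edge direction, and the stored value 16 for node 7323 are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical relation labels; the dataset's own labels are listed in Table 3.
SON = "son"
CROSS_GEN = "cross_generation_descendant"

Edge = Tuple[int, str, int]  # (ancestor id, relation, descendant id), direction assumed

# Node id -> cross-generation link value (None when the attribute does not apply).
cross_generation_value: Dict[int, Optional[int]] = {0: None, 1: None, 7323: 16}

edges: List[Edge] = [
    (0, SON, 1),             # 1 is a son of 0
    (0, CROSS_GEN, 7323),    # intermediate persons between 0 and 7323 are unknown
]


def build_children_index(edge_list: List[Edge]) -> Dict[int, List[Tuple[str, int]]]:
    """Group outgoing edges by ancestor id."""
    index: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
    for src, relation, dst in edge_list:
        index[src].append((relation, dst))
    return index


def cross_generation_gap(edge: Edge, gap_by_node: Dict[int, Optional[int]]) -> Optional[int]:
    """Return the stored gap for a cross-generation edge, or None for other edges."""
    _, relation, dst = edge
    if relation != CROSS_GEN:
        return None
    gap = gap_by_node.get(dst)
    if gap is None:
        raise ValueError(f"node {dst} has a cross-generation edge but no gap value")
    return gap


if __name__ == "__main__":
    index = build_children_index(edges)
    for edge in edges:
        print(edge, "gap:", cross_generation_gap(edge, cross_generation_value))
```

Keeping the gap on the descendant node, rather than on the edge, mirrors the description above, where the value is stored in a node attribute.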

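1. Genealogy volume partitioning source code download

To promote research on genealogy volume partitioning algorithms, the Huapu system provides the source code of the TPA genealogy algorithm. Download link: TPA_CODE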

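2. Running

2.1 Download the genealogy dataset Genealogy-MBW from the [Genealogy dataset] module of the Huapu system;

2.2 Edit the file paths in main.py, then run main.py to complete the volume partitioning.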

Xindong Wu, Jiao Li, Peng Zhou, Chenyang Bu. Fusion Techniques for Fragmented Genealogy Data (碎片化家谱数据的融合技术). Journal of Software (软件学报). doi: 10.13328/j.cnki.jos.006010.

2020-12-08

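Xindong Wu, Shaojing Sheng, Tingting Jiang, Chenyang Bu, Minghui Wu. From Knowledge Graph to Data Middle Platform: The Huapu System (从知识图谱到数据中台: 华谱系统). Acta Automatica Sinica (自动化学报), 46(10): 2045-2059, 2020.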

2020-12-08

Xindong Wu, Bingbing Dong, Xinzheng Du, Wei Yang. Data Governance Technology (数据治理技术). Journal of Software (软件学报), 30(9): 2830-2856, 2019.

2020-12-08

Peipei Li, Haixun Wang, Hongsong Li, and Xindong Wu. Employing Semantic Context for Sparse Information Extraction Assessment. ACM Transactions on Knowledge Discovery from Data, 12(5): 54:1-36, July 2018.

2020-12-08

Jipeng Qiang, Xindong Wu. Unsupervised Statistical Text Simplification. IEEE Transactions on Knowledge and Data Engineering. DOI: 10.1109/TKDE.2019.2947679, in press, 2019.

2020-12-08

Xindong Wu, Tingting Jiang, Yi Zhu, and Chenyang Bu. Knowledge Graph for China's Genealogy. Proceedings of 11th IEEE International Conference on Knowledge Graph (ICKG-2020). IEEE, 2020: 529-535. ICKG-2020 Best Paper Award.

2020-12-08

Xindong Wu, Shaojing Sheng, and Peng Zhou. Balanced Tree Partitioning with Succinct Logic. Proceedings of 11th IEEE International Conference on Knowledge Graph (ICKG-2020). IEEE, 2020: 552-559.

2020-12-08

Guliu Liu, Lei Li. Knowledge Fragment Cleaning in a Genealogy Knowledge Graph. Proceedings of 11th IEEE International Conference on Knowledge Graph (ICKG-2020) August 9-11, 2020, Nanjing, China.

2020-12-08

Shengwei Ji, Chenyang Bu, Lei Li, and Xindong Wu. Local Graph Edge Partitioning with a Two-Stage Heuristic Method. In: Proceedings of the 39th IEEE International Conference on Distributed Computing Systems (ICDCS 2019), pp. 228-237, 2019.

2020-12-08

Shaojing Sheng, P. Zhou, and X. Wu. CEPV: A Tree Structure Information Extraction and Visualization Tool for Big Knowledge Graph. In: Proceedings of 9th International Conference on Big Knowledge (ICBK 2018), pp. 221-228, 2018.

2020-12-08

Bingbing Dong, Zan Zhang, Jiao Li, Chenyang Bu and Xindong Wu. Hypernode: Entity Fusion for Data Traceability and Link Prediction. Proceedings of 2022 IEEE International Conference on Data Mining (ICDM), 111-120, 2022.

2023-02-06

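1. Xindong Wu, Tingting Jiang, Chenyang Bu, Lei Li, Xiaojian Liu. "A Fusion Method for Attribute Names of Genealogy Persons". Invention patent, granted (grant publication No. CN 109284393 B).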

2020-12-08