Table of Contents

    20 January 2023, Volume 59 Issue 1
    Word-Based Domain Feature-Sensitive Multi-domain Neural Machine Translation
    HUANG Zengcheng, MAN Zhibo, ZHANG Yujie, XU Jin’an, CHEN Yufeng
    2023, 59(1):  1-10.  DOI: 10.13209/j.0479-8023.2022.063
    The accuracy of existing word-based domain feature learning methods on domain discrimination is still low, and further research on domain feature learning is required. To improve domain discrimination and provide accurate translation, this paper proposes a word-based domain feature-sensitive learning mechanism comprising 1) context feature encoding at the encoder side: to widen the range over which word-based domain features are learned, convolutional neural networks (CNN) are introduced in the encoder to extract features from word strings of different lengths in parallel as word context features; and 2) enhanced domain feature learning: a domain discriminator module based on multi-layer perceptrons (MLP) is designed to enhance the ability to learn more accurate domain proportions from word context features and to improve the accuracy of word domain discrimination. Experiments on the English-Chinese task of UM-Corpus and the English-French task of OPUS show that the average BLEU scores of the proposed method exceed the strong baseline by 0.82 and 1.06, respectively. The accuracy of domain discrimination is improved by 10.07% and 18.06% compared with the baseline. Further studies show that the improvements in average BLEU score and domain discrimination accuracy are attributable to the proposed word-based domain feature-sensitive learning mechanism.
    English Books Automatic Classification According to CLC
    JIANG Yanting
    2023, 59(1):  11-20.  DOI: 10.13209/j.0479-8023.2022.070
    Faced with a lack of English books annotated with CLC (Chinese Library Classification) labels and with imbalanced data, this paper combines augmentation strategies from the library, information and general fields: 1) classification mapping from Library of Congress Classification (LCC) to CLC; 2) semantic enhancement based on a Chinese-English parallel thesaurus; 3) punctuation insertion or 4) conjunction insertion in the initial texts. Experiments show that combining the 4 strategies improves model performance on the test set: accuracy and Macro-F1 increase by 3.61 and 3.35 percentage points, respectively. The combined method is superior to other text enhancement strategies. Through visualization of BERT word embeddings and computation of word information entropy, this paper infers that punctuation or conjunction insertion works because of the variety of adjacent words and the connective function of these tokens in grammar.
    A Multi-information Perception Based Method for Question Answering in Multi-party Conversation
    GAO Xiaoqian, ZHOU Xiabing, ZHANG Min
    2023, 59(1):  21-29.  DOI: 10.13209/j.0479-8023.2022.069
    Question answering in multi-party conversation typically focuses on exploring discourse structures or speaker-aware information but ignores the interaction between questions and conversations. To solve this problem, a new model that integrates multiple kinds of information is proposed. Specifically, the model hierarchically models discourse structures, speaker-aware dependencies among interlocutors and question-context information, and uses a graph convolutional neural network to propagate contextual information over them. In addition, the model employs an attention-based interaction layer to enhance the understanding of multi-party conversations by selecting more helpful information. Furthermore, the model is the first to pay attention to the explicit interaction between question and context. Experimental results show that the model outperforms multiple baselines, indicating that it understands the conversations more comprehensively.
    A Short Text Matching Model Incorporating Contextual Semantic Differences
    ZHANG Wenhui, WANG Meiling, HOU Zhirong
    2023, 59(1):  30-38.  DOI: 10.13209/j.0479-8023.2022.071
    Short text matching models often fail to measure the semantic similarity between sentences accurately when sentences with the same wording differ in meaning or sentences with different wording are semantically equivalent. To solve this problem, the paper proposes a short text matching model that integrates contextual semantic differences. In this model, language models from the BERT series serve as the basic matching model, a novel Diff Transformer structure extracts difference features, and a gate mechanism integrates the basic semantic representations with the difference features for better matching. The model achieves performance on par with advanced models on Chinese test datasets.
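The abstract does not specify the form of the gate, so the following is only a minimal sketch of one common element-wise gated fusion, under stated assumptions: the function `gated_fusion`, its weight arguments and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, difference, w_sem, w_diff, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed (hypothetical) form: g = sigmoid(w_sem*s + w_diff*d + bias),
    fused = g*s + (1 - g)*d for each dimension.
    """
    fused = []
    for s, d, ws, wd, b in zip(semantic, difference, w_sem, w_diff, bias):
        gate = sigmoid(ws * s + wd * d + b)
        fused.append(gate * s + (1.0 - gate) * d)
    return fused


# Toy vectors; with all-zero weights the gate is 0.5 and the result is the average.
semantic_repr = [1.0, 2.0]
difference_feat = [3.0, 0.0]
zeros = [0.0, 0.0]
fused = gated_fusion(semantic_repr, difference_feat, zeros, zeros, zeros)
```

In a trained model the gate weights would be learned, so that each dimension leans towards the basic semantic representation or the difference feature as needed.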
    Document Constrained Translation Quality Estimation Model
    FENG Qin, GONG Zhengxian, YE Heng, ZHOU Guodong
    2023, 59(1):  39-47.  DOI: 10.13209/j.0479-8023.2022.067
    This paper proposes a new translation quality estimation model that does not rely on reference translations to score the translation of each source-language sentence. The authors model the sentence-level semantic difference and the word-level referential difference between the source and the translation, and design an additional loss function so that the model constrains these differences as much as possible when predicting scores. The experimental results show that the proposed method effectively improves the performance of the quality estimation model. Compared with the baseline system, the proposed method improves the Pearson correlation coefficient by up to 6.68 percentage points.
    Difference between Multi-modal vs. Text Pre-trained Models in Embedding Text
    SUN Yuchong, CHENG Xiwei, SONG Ruihua, CHE Wanxiang, LU Zhiwu, WEN Jirong
    2023, 59(1):  48-56.  DOI: 10.13209/j.0479-8023.2022.074
    This paper provides a quantitative comparison between the text embeddings of a text pre-trained model (i.e., RoBERTa) and a multi-modal pre-trained model (i.e., WenLan). Two quantitative comparison methods are proposed: 1) in each embedding space, representing the semantics of a word by the set of its k nearest words, then analyzing the semantic change of the word between the two spaces using the Jaccard similarity of the two sets; 2) forming pairs between each word and its k nearest words to analyze their relationships. The results show that multi-modal pre-training brings more semantic changes for more abstract words (e.g., success, love), and that the multi-modal pre-trained model can better differentiate antonyms and discover more hypernyms or hyponyms, while text pre-training works better in finding synonyms. Moreover, the multi-modal pre-trained model can construct more extensive associative relationships between words.
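The first comparison method can be sketched as follows. This is an illustrative sketch, not the authors' code: the neighbor count `k`, the use of cosine similarity, the helper names and the two-dimensional toy vectors are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def k_nearest(word, space, k):
    """Set of the k words closest to `word` in one embedding space."""
    others = [w for w in space if w != word]
    others.sort(key=lambda w: cosine(space[word], space[w]), reverse=True)
    return set(others[:k])


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def neighbor_overlap(word, space_a, space_b, k):
    """Jaccard similarity of a word's k-nearest-word sets in two spaces."""
    return jaccard(k_nearest(word, space_a, k), k_nearest(word, space_b, k))


# Made-up toy embeddings standing in for the two models' spaces.
text_space = {"love": (1.0, 0.0), "like": (0.9, 0.1),
              "hate": (0.8, 0.3), "heart": (0.0, 1.0)}
multimodal_space = {"love": (1.0, 0.0), "like": (0.9, 0.1),
                    "hate": (0.0, 1.0), "heart": (0.8, 0.3)}

overlap = neighbor_overlap("love", text_space, multimodal_space, k=2)
```

A low overlap means the word's neighborhood differs strongly between the two spaces, which is the kind of semantic change the abstract reports for abstract words.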
    A Joint Learning Approach to Few-Shot Learning for Multi-category Sentiment Classification
    LI Zicheng, CHANG Xiaoqin, LI Yameng, LI Shoushan, ZHOU Guodong
    2023, 59(1):  57-64.  DOI: 10.13209/j.0479-8023.2022.068
    Most few-shot learning approaches cannot achieve satisfactory results in fine-grained multi-category sentiment classification tasks. To solve this problem, a joint learning approach to few-shot learning for multi-category sentiment classification is proposed. Specifically, we use the pre-trained token-replaced detection model as the few-shot learner and reformulate fine-grained sentiment classification as both a classification problem and a regression problem by appending classification and regression templates and label description words to the input at the same time. For joint learning, several fusion methods are proposed to fuse the classification prediction and the regression prediction. Experimental results show that, compared with mainstream few-shot methods, the proposed approach achieves clearly better F1-score and accuracy.
    Medical Entity Relation Extraction Based on Pre-trained Model and Hybrid Neural Network
    ZHAO Dandan, ZHANG Junpeng, MENG Jiana, ZHANG Zhihao, SU Wen
    2023, 59(1):  65-75.  DOI: 10.13209/j.0479-8023.2022.065
    Medical text has high entity density and verbose sentence structure, which makes simple neural network methods unable to capture its semantic features. Therefore, a hybrid neural network method based on a pre-trained model is proposed. Firstly, a pre-trained model is used to obtain dynamic word vectors, and entity tagging features are extracted. Secondly, the contextual features of the medical text are obtained through a bidirectional long short-term memory network. Simultaneously, the local features of the text are obtained using a convolutional neural network. Then the global semantic features of the text are obtained by weighting the sequence features through an attention mechanism. Finally, the entity tagging features are fused with the global semantic features, and the extraction results are obtained through a classifier. Experimental results of entity relation extraction on a medical domain dataset show that the proposed hybrid neural network model outperforms mainstream models, which indicates that this multi-feature fusion method can improve entity relation extraction.
    Exploration of Knowledge Driven Event Hyperbolic Embedding Temporal Relation Extraction Method
    DUAN Jianyong, DAI Shiwei, WANG Hao, HE Li, LI Xin
    2023, 59(1):  76-82.  DOI: 10.13209/j.0479-8023.2022.066
    To address the asymmetry of temporal relations between events, event representations are mapped to hyperbolic space to extract temporal relations of events. The word embedding representation of an event is constructed from pre-trained word vectors and external knowledge through simple operations. Experimental results on publicly released datasets show that the F1 value of the model is generally 2% higher than that of the baseline model, indicating that the method improves event temporal relation extraction.
    Multi-turn Event Argument Extraction Based on Role Information Guidance
    YU Yuanfang, ZHANG Yong, ZUO Haoyang, ZHANG Lianfa, WANG Tingting
    2023, 59(1):  83-91.  DOI: 10.13209/j.0479-8023.2022.064
    To address two problems in general-domain event argument extraction research, namely insufficient use of role information and lack of interaction between arguments, a role information-guided multi-turn event argument extraction model is proposed. It enhances the semantic information of texts and the interaction between arguments, thereby improving the performance of event argument extraction. First, to better use role knowledge to guide argument extraction, the model builds role knowledge from role definitions, encodes role information and text independently, and uses an attention-based method to obtain label-knowledge-enhanced representations. The enhanced embeddings are then used to predict whether each token is a start or end position for some category. Meanwhile, to make full use of the interaction between event arguments during extraction, this paper designs a multi-turn event argument extraction algorithm inspired by multi-turn dialogue models. Following the natural logic of “from easy to hard”, the algorithm selects in each turn the role with the highest prediction probability, that is, the most predictable role, for extraction. To model the interaction between arguments, the model introduces a history embedding and updates it after each prediction to help extract the event arguments of the next turn. Experimental results show that role information guidance and the multi-turn extraction algorithm effectively improve argument extraction, and the method achieves state-of-the-art performance.
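The turn-by-turn selection described above can be sketched as a greedy loop. This is a hedged illustration only: the `predict(role, history)` interface, the role names and the confidence values are hypothetical stand-ins for the paper's model, and a plain list replaces the history embedding.

```python
def easy_first_extract(roles, predict):
    """Extract one role per turn, always taking the most confident one.

    `predict(role, history)` is a hypothetical scorer returning
    (argument, confidence) given the roles already extracted.
    """
    history = []            # stands in for the history embedding
    remaining = list(roles)
    while remaining:
        scored = [(predict(role, history), role) for role in remaining]
        (argument, _confidence), role = max(scored, key=lambda item: item[0][1])
        history.append((role, argument))   # update history after each turn
        remaining.remove(role)
    return history


def toy_predict(role, history):
    """Made-up scorer: 'victim' becomes easier once 'attacker' is known."""
    base = {"attacker": 0.9, "victim": 0.4, "place": 0.6}[role]
    if role == "victim" and any(r == "attacker" for r, _ in history):
        base += 0.3
    return ("arg_" + role, base)


extracted = easy_first_extract(["victim", "place", "attacker"], toy_predict)
extraction_order = [role for role, _ in extracted]
```

In this toy run the order is attacker, victim, place: extracting the easiest role first raises the confidence of a related role in the next turn, which is the interaction the history update is meant to capture.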
    Microbial Nitrogen Transformation Key Processes and Its Influencing Factors in Water and Sediment of Xining Section of the Huangshui River
    SHANG Yunyi, LI Zhilong, SUN Liyu, CHEN Qian
    2023, 59(1):  92-104.  DOI: 10.13209/j.0479-8023.2022.122
    A total of 58 water and sediment samples were collected from typical sampling sections of the Xining section of the Huangshui River during the wet season (July 2018) and the dry season (April 2019), and 6 wastewater samples were taken directly from the effluents of wastewater treatment plants (WWTPs) in the dry season. A total of 12 nitrogen functional genes were quantitatively analyzed by real-time fluorescence quantitative PCR (qPCR). The results showed that the average total nitrogen (TN) concentration in the Huangshui River was 3.06±1.23 (1.308–6.51) mg/L. The nitrogen functional genes with high relative abundance in water and sediments were narG, nirS and nosZ. There were significant seasonal differences in the abundance and composition of nitrogen functional genes in sediments but not in water. The key nitrogen transformation process was denitrification, and its average contributions to nitrogen removal in water and sediment were 88% and 98%, respectively. The nitrogen transformation process in water was mainly affected by pH, TN and NO₃⁻-N: the ammonia oxidation process was negatively correlated with the NO₃⁻-N concentration, and the denitrification process was negatively correlated with pH. The nitrogen transformation process in sediment was mainly related to the water nitrogen concentration and to sediment pH, TN, total phosphorus and organic carbon concentration: the ammonia oxidation process was negatively related to the water nitrogen concentration, and the denitrification process was related to the sediment properties.
    Further analysis showed that the wastewater treatment plant effluents significantly reduced the abundance of genes such as AOA-amoA, CMX-amoA, nirS, nxrB, napA, narG and hzsA in the receiving water, which might limit the denitrification, anammox, nitrification and comammox processes, but at the same time elevated the abundance of nrfA genes and increased the contribution of dissimilatory nitrate reduction to ammonium (DNRA) to nitrogen removal in water. The relative abundances of genes such as AOA-amoA, nxrB and CMX-amoA were significantly reduced in the sediments of the affected river segments, and the nitrification and comammox processes in sediments were suppressed. This study can provide a scientific basis for nitrogen pollution control in the Xining section of the Huangshui River.
    Research on Refined Simulation of Waterlogging Process in Urban Overpass Tunnels
    YE Yujia, QIN Huapeng, MAO Junqi
    2023, 59(1):  105-114.  DOI: 10.13209/j.0479-8023.2022.105
    To scientifically guide drainage and waterlogging prevention retrofit projects for urban overpasses and reduce the risk of waterlogging, a high-resolution urban waterlogging model was constructed with an overpass in Shenzhen as the research object. The model was calibrated and verified using measured surface water data and then used to simulate flow accumulation in the overpass tunnel under different design rainfalls before and after three retrofits: rainwater outlet anti-blocking, a sunken greenbelt, and a joint measure combining the two. The results show that 1) the high-resolution urban waterlogging model can accurately simulate the dynamic change of flow accumulation in the urban overpass. 2) Under rainfall with 5-year, 20-year, and 100-year return periods, the overpass tunnel suffers serious waterlogging, and flow accumulation changes rapidly in time and space. Under rainfall with a 100-year return period, the maximum inundation depth and area reach 1.52 m and 1833 m², respectively, and the maximum growth rates of inundation depth over time and space reach 0.04 m/min and 0.23 m/10 m, respectively. 3) The combined measure reduces waterlogging more efficiently than either renovation measure used separately. Under rainfall with a 100-year return period, ponding deeper than 0.6 m remains when rainwater outlet anti-blocking or the sunken greenbelt is applied alone. With the combined measure, the inundation depth can be kept below 0.5 m and the waterlogging duration shortened to less than 30 min, which effectively alleviates the waterlogging of overpass tunnels.
    Synthesis of PHB by Recombinant E. coli Using Acetic Acid and Lactic Acid as Carbon Sources
    XIAO Meng, JIANG Ying, CUI Yixuan, Sadaf Riaz, Maurycy Daroch
    2023, 59(1):  115-124.  DOI: 10.13209/j.0479-8023.2022.100
    Two cheap byproducts, lactic acid and acetic acid, were used directly as carbon sources for polyhydroxybutyrate (PHB)-producing recombinant Escherichia coli to investigate their effects on the growth and PHB yield of recombinant E. coli. The PHB synthetic operon phaCAB gene cluster, constructed on the basis of Cupriavidus necator genes, was cloned into the pBAD vector to obtain the PHB-producing recombinant strain BL21_pBAD_phaCAB, which was expressed in E. coli using arabinose as an inducer. The recombinant strain was cultured in LB and M9 medium, respectively, and the growth rate and PHB yield were studied to identify the optimum medium for PHB-producing recombinant E. coli. To explore the effect of lactic acid and acetic acid on the growth and PHB yield of the recombinant strain, 0.04 g/L lactic acid, 1.2 g/L lactic acid, 0.02 g/L acetic acid, 0.6 g/L acetic acid, 0.04 g/L lactic acid + 0.02 g/L acetic acid, and 1.2 g/L lactic acid + 0.4 g/L acetic acid were added to M9 medium (containing 2 g/L glucose) as the experimental groups, with M9 medium (containing 2 g/L glucose) as the control group. Culture media were sampled at 6, 12, 24, and 36 hours to track the concentration changes of glucose, acetic acid, and lactic acid. The results show that the low-nitrogen M9 medium is more suitable for the growth of PHB-producing recombinant E. coli in a low-sugar culture environment. After glucose is consumed, E. coli can use acetic acid and lactic acid as carbon sources for metabolism, and adding a proper amount of acetic acid and lactic acid improves the PHB yield of the recombinant strain. When 1.2 g/L lactic acid is added, a PHB yield of up to 1.43 g/L is achieved, which is 78% higher than the control.
    Characteristics of Phages in Water and Sediments of Hanjiang River
    SUN Liyu, SHANG Yunyi, LI Zhilong, XUE Zehuan, LIU Tang
    2023, 59(1):  125-132.  DOI: 10.13209/j.0479-8023.2022.124
    Based on water and sediment monitoring samples from 6 sections in the middle and lower reaches of the Hanjiang River in March and October 2014, 384 high-quality vOTUs (viral Operational Taxonomic Units) were obtained. More than 95% of the vOTUs belonged to Caudovirales, and the three most abundant families were Myoviridae, Siphoviridae and Podoviridae. PCoA (principal coordinates analysis) and ANOSIM (analysis of similarities) showed that the phage community structure in sediments was relatively stable, while the phage community in water changed readily with the seasons, and that the phage communities in water and sediments of the same basin might be connected. Hosts came from 19 phyla of 2 domains (kingdoms), and the most abundant host was Proteobacteria. 88% of the vOTUs had a single host at the phylum level, while three vOTUs exhibited a broad host range across five phyla. Phages infecting hosts across domains (kingdoms) infected Thermoproteota and most commonly infected Bacteroidota. Compared with water, the community composition of phage hosts in sediments was more diverse. Pearson correlation analysis showed that the composition of phages agreed with that of their microbial hosts at the phylum level.
    Microscopic Occurrence Types of Jimsar Continental Shale Oil
    DING Zhenhua, SHI Xiang, SONG Ping, SHI Weifeng, ZHANG Jigang, LI Xutao, SHI Yongmin, LI Wei
    2023, 59(1):  133-142.  DOI: 10.13209/j.0479-8023.2022.114
    To clarify the microscopic occurrence state and type of shale oil, and to avoid the influence of water-based drilling and sampling on the distribution of crude oil during the experiment, typical oil-bearing cores of the Jimsar shale were selected. Liquid nitrogen was used throughout drilling, cutting and grinding, and the microscopic reservoir mineral type and structure, the reservoir space type and morphology, and the distribution of C, O, Si, Al, Ca, K, Na, Mg and other elements at the micro- and nano-scale were obtained by combining full energy spectrum scanning electron microscopy, secondary electron imaging and backscattering technology. The enrichment degree of crude oil at the micro- and nano-scale was determined from the distribution of C element content after removing mineral factors. Combining the mineral rock fabric and reservoir space morphology, the micro- and nano-scale occurrence states of crude oil in shale oil were quantitatively characterized. According to the configuration relationship between crude oil and pore throats, the microscopic pore-throat structure of the reservoir and the occurrence state of crude oil were characterized, and the occurrence types of crude oil were divided and clarified. Four microscopic occurrence types of crude oil in the study area were summarized: movable oil in micron-scale macro-pores dissolved by dolomite, movable oil in micro-nano-scale pores between sand particles, semi-bound oil film adsorbed by pore throat walls between sand particles, and authigenic clay minerals bound oil within the intercrystalline pores.
    Geochemical Features and Tectonic Significance of Late Archean Metavolcanic Rocks in Hengshan Area, North China Craton
    GAO Shansong, LI Qiugen, HU Pengyue, Yasin Rahim, LI Hongying
    2023, 59(1):  143-160.  DOI: 10.13209/j.0479-8023.2022.102
    Petrographic, geochemical, zircon U-Pb geochronological and Hf isotopic data are presented for the exposed metavolcanic rocks in the Hengshan region to determine their eruption age, petrogenesis and geodynamic significance in the Trans-North China Orogen, North China Craton. Zircon U-Pb dating of the rocks yielded a weighted mean ²⁰⁷Pb/²⁰⁶Pb age of 2508 ± 20 Ma, indicating that their protoliths were erupted in the late Neoarchean. Geochemically, the Hengshan metavolcanic rocks are composed of basalt and basaltic andesite and are marked by variable SiO₂ (45.51%–62.67%), FeOt (4.43%–15.72%) and MgO (3.75%–8.14%) contents, indicating derivation from a mantle source with subsequent fractional crystallization of clinopyroxene, hornblende and magnetite. These metavolcanic rocks are characterized by enrichments in light rare earth elements (LREE) and large-ion lithophile elements (LILE), as well as depletions in heavy rare earth elements (HREE) and high-field-strength elements (HFSE), with relatively high Th content and Th/Yb values, similar to those of typical calc-alkaline arc-like volcanic rocks. Such arc-like geochemical signatures, together with some incompatible elemental ratios (e.g., Nb/Yb, Zr/Yb and (Nb/La)N), reveal their origin in a sub-arc enriched mantle wedge variably metasomatized by slab-derived dehydration fluids. Combined with the regional geological background, these results suggest that the Hengshan metavolcanic rocks developed in a continental arc setting in the late Neoarchean.
    Empathy Differences between the Elderly and the Young: Discrepancies in Positive and Negative Emotions
    PANG Fangfang, CHEN Wei, SU Ying, GUAN Ruiyuan
    2023, 59(1):  161-169.  DOI: 10.13209/j.0479-8023.2022.099
    The authors compared empathy between the elderly and the young using a combination of a self-assessment questionnaire and a behavioral task. In study 1, 280 elderly and 304 young adults were recruited as participants and completed the Interpersonal Reactivity Index Scale (IRI), a self-assessment questionnaire on empathy. In study 2, the Multifaceted Empathy Test (MET), a behavioral task, was administered to 71 older adults and 74 younger adults to compare their differences in empathy. The results showed that: 1) compared with the young group, the cognitive empathy of the elderly decreased significantly according to both the self-assessment questionnaire and performance in the behavioral task; 2) compared with the young group, the emotional empathy of the elderly decreased significantly for negative emotional stimuli; 3) within the elderly group, the intensity of emotional empathy for positive emotion pictures was markedly higher than that for negative emotion pictures, while no significant difference was found in the young group. These results suggest that empathy for positive and negative emotions develops separately from youth to old age.
    Effects of Self-Focus on External Attention and State Anxiety in Social Anxiety: Evidence from Eye-Movement and Physiological Measures
    CHEN Huijing, LIN Muyu, QIAN Mingyi
    2023, 59(1):  170-178.  DOI: 10.13209/j.0479-8023.2022.117
    To investigate the effects of self-focus on attention towards external social feedback and on state anxiety of socially anxious individuals in a simulated real social situation, 105 participants were recruited and gave an impromptu speech to a pre-recorded audience showing positive, neutral and negative feedback. Self-focus was manipulated through instructions. Eye movements served as indicators of attention toward audience members with different emotional valence. Skin conductance level and heart rate were measured. The results showed that self-focus reduced attention to external social feedback in both the high and low socially anxious groups. In the high self-focus condition, both groups exhibited a higher heart rate than in the low self-focus condition. The findings indicate that self-focus impairs the processing of external stimuli in socially anxious individuals.