List of articles (by subject): Semantic Web


    • Open Access Article

      1 - Referral Traffic Analysis: A Case Study of the Iranian Students' News Agency (ISNA)
      Roya Hassanian Esfahani, Mohammad Javad Kargar
      Web traffic analysis is a well-known e-marketing activity. Today most news agencies have a web presence, providing a variety of online services to their customers, and the number of online news consumers is increasing dramatically all over the world. A news website usually benefits from different acquisition channels, including organic search, paid search, referral links, direct hits, links from online social media, and e-mail. This article presents the results of an empirical study analyzing the referral traffic of a news website through data mining techniques. The main methods include correlation analysis, outlier detection, clustering, and model performance evaluation. The results reject any significant relationship between the amount of referral traffic coming from a referrer website and that website's popularity. Furthermore, the referrer websites in the study fall into three clusters when the K-means algorithm with squared Euclidean distance is applied, and performance evaluations confirm the significance of the model. Among the detected clusters, the most populated one was labeled "Automatic News Aggregator Websites" by the experts. The findings help to better understand the different referring behaviors, which form around 15% of the overall traffic of the Iranian Students' News Agency (ISNA) website. They are also helpful for developing more efficient online marketing plans, business alliances, and corporate strategies.
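The clustering step named in this abstract, K-means with squared Euclidean distance, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the referrer features (daily visits, pages per visit) are invented for the example.

```python
import random

def kmeans(points, k, iters=20, seed=42):
    """Minimal K-means with squared Euclidean distance.
    points: list of feature tuples, one per referrer website (hypothetical features)."""
    rnd = random.Random(seed)
    centroids = rnd.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to the nearest centroid by squared Euclidean distance
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[idx].append(p)
        # recompute each centroid as the coordinate-wise mean of its cluster
        centroids = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters

# hypothetical referrer features: (daily visits, pages per visit)
referrers = [(5, 1.1), (6, 1.0), (120, 2.5), (130, 2.4), (800, 4.0), (820, 4.2)]
cents, cls = kmeans(referrers, k=3)
```

With three well-separated groups of referrers, the three centroids land near one group each, mirroring the three-cluster result the abstract reports.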
    • Open Access Article

      2 - Computing Semantic Similarity of Documents Based on Semantic Tensors
      Navid Bahrami, Amir H. Jadidinejad, Mojdeh Nazari
      Exploiting the semantic content of texts has always been an important and challenging issue in Natural Language Processing, due to its wide range of applications such as finding documents related to a query, document classification, and computing the semantic similarity of documents. In this paper, a novel corpus-based approach for computing the semantic similarity of texts is proposed, using the Wikipedia corpus organized in a three-dimensional tensor structure. First, the semantic vectors of the words in a document are obtained from the vector space derived from the words in Wikipedia articles; then the semantic vector of the document is formed from its word vectors. Measuring the semantic similarity of documents can consequently be done by comparing their semantic vectors. The vector space of the Wikipedia corpus raises the curse-of-dimensionality challenge because of its high-dimensional vectors: vectors in a high-dimensional space tend to be very similar to each other, making it meaningless to identify the most appropriate semantic vector for a word. The proposed approach therefore mitigates the curse of dimensionality by reducing the vector space dimensions through random indexing, which also yields a significant improvement in the memory consumption of the approach. Handling synonymous and polysemous words becomes feasible by means of the structured co-occurrences captured through random indexing.
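The dimensionality-reduction technique named here, random indexing, can be sketched briefly: each context (here, a document) gets a sparse random index vector, and a word's vector is the sum of the index vectors of the contexts it occurs in, so word vectors stay at a fixed small dimensionality regardless of corpus size. The dimension and sparsity values below are illustrative assumptions, not the paper's settings.

```python
import random
from collections import defaultdict

DIM, NONZERO = 300, 6  # reduced dimensionality and sparsity (illustrative values)

def index_vector(rnd):
    """Sparse ternary random vector: a few +1/-1 entries, the rest zero."""
    v = [0] * DIM
    for pos in rnd.sample(range(DIM), NONZERO):
        v[pos] = rnd.choice((-1, 1))
    return v

def random_indexing(docs, seed=0):
    """Accumulate, for each word, the index vectors of the documents it occurs in."""
    rnd = random.Random(seed)
    doc_index = [index_vector(rnd) for _ in docs]
    word_vecs = defaultdict(lambda: [0] * DIM)
    for dv, doc in zip(doc_index, docs):
        for w in doc.split():
            word_vecs[w] = [a + b for a, b in zip(word_vecs[w], dv)]
    return dict(word_vecs)

def doc_vector(doc, word_vecs):
    """Document vector as the sum of its word vectors, as in the pipeline described above."""
    v = [0] * DIM
    for w in doc.split():
        v = [a + b for a, b in zip(v, word_vecs.get(w, [0] * DIM))]
    return v
```

Because the random index vectors are nearly orthogonal, similarity between the accumulated vectors approximates similarity in the full co-occurrence space at a fraction of the memory cost, which is the improvement the abstract claims.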
    • Open Access Article

      3 - Scalable Community Detection through Content and Link Analysis in Social Networks
      Zahra Arefian, Mohammad Reza Khayyam Bashi
      Social network analysis is an important problem that has attracted a great deal of attention in recent years. Such networks provide users with many different applications and features; as a result, they have been described as the most important phenomenon of recent decades. Using the features available in a social network, the first task is to discover its complete and comprehensive communication structure. Many community detection methods have been proposed, based either on link analysis or on node content; most research focuses on only one of these, and attending to only one method leads to a confused and incomplete exploration. Community detection is generally associated with graph clustering, and most clustering methods rely on analyzing links while ignoring the content that could improve clustering quality. In this paper, an integrated algorithm for scalable community detection is proposed that clusters graphs according to both link structure and node content, aiming to find clusters of nodes with similar features. The algorithm first weights the graph according to node content and then analyzes the network graph using the Markov Clustering Algorithm; in other words, strong relationships are distinguished from weak ones. A multi-level variant of the Markov Clustering Algorithm is proposed to achieve scalability. The proposed integrated algorithm was tested on real datasets, and the effectiveness of the proposed method is evaluated.
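The Markov Clustering Algorithm (MCL) this abstract builds on can be sketched in a few lines: it alternates expansion (random-walk mixing via matrix multiplication) and inflation (raising entries to a power and renormalizing), which is precisely how strong ties get reinforced and weak bridges fade. This is a single-level illustrative sketch, not the paper's multi-level, content-weighted variant.

```python
def mcl(adj, inflate=2.0, iters=10):
    """Minimal Markov Clustering sketch on a 0/1 adjacency matrix (self-loops added)."""
    n = len(adj)
    def normalize(m):
        # column-normalize to keep each column a probability distribution
        return [[m[i][j] / s if (s := sum(m[k][j] for k in range(n))) else 0.0
                 for j in range(n)] for i in range(n)]
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    m = normalize([[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)])
    for _ in range(iters):
        m = matmul(m, m)                                    # expansion
        m = normalize([[v ** inflate for v in row] for row in m])  # inflation
    # attractor rows with remaining mass define the clusters
    found = {}
    for i in range(n):
        for j in range(n):
            if m[i][j] > 1e-6:
                found.setdefault(i, set()).add(j)
    return list(found.values())

# two triangles joined by a single weak bridge (nodes 0-2 and 3-5)
adj = [[0,1,1,0,0,0],
       [1,0,1,0,0,0],
       [1,1,0,1,0,0],
       [0,0,1,0,1,1],
       [0,0,0,1,0,1],
       [0,0,0,1,1,0]]
clusters = mcl(adj)
```

On this toy graph the bridge edge 2-3 is the weak relationship that inflation suppresses, so the two triangles come out as separate communities.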
    • Open Access Article

      4 - Analysis of Expert Finding Algorithms in Social Networks in Order to Rank the Top Algorithms
      AhmadAgha Kardan, Behnam Bozorgi
      The ubiquity of the Internet and social networks has turned question-and-answer communities into an environment where users can ask questions about anything or share their knowledge by answering other users' questions. These communities, designed for knowledge sharing, aim to improve user knowledge, making it imperative to have a mechanism that can evaluate users' knowledge level or, in other words, "find experts". Expert-finding algorithms are needed in social networks and in any other knowledge-sharing environment such as question-and-answer communities. There are various content analysis and link analysis methods for expert finding in social networks. This paper challenges four algorithms by applying them to our dataset and analyzing the results in order to compare them. The algorithms suitable for expert finding have been identified and ranked. Based on the results and tests, it is concluded that the Z-score algorithm performs better than the others.
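The Z-score measure that wins this comparison is easy to sketch: it treats a user's answer/question counts like coin flips and scores how far the user deviates toward answering, so heavy answerers rank as experts. A minimal sketch, with invented user counts:

```python
import math

def z_score(answers, questions):
    """Z-score expertise measure: z = (a - q) / sqrt(a + q).
    Users who answer far more than they ask get high scores."""
    n = answers + questions
    if n == 0:
        return 0.0
    return (answers - questions) / math.sqrt(n)

# hypothetical user activity counts: (answers given, questions asked)
users = {"u1": (40, 2), "u2": (5, 5), "u3": (1, 20)}
ranking = sorted(users, key=lambda u: z_score(*users[u]), reverse=True)
```

Here u1 (mostly answers) ranks above u2 (balanced) and u3 (mostly questions), which is the expert ordering such an algorithm is meant to produce.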
    • Open Access Article

      5 - De-lurking in Online Communities Using Repost Behavior Prediction Method
      Omid Reza Bolouki Speily
      Nowadays, with the advent of social networks, a big change has occurred in the structure of web-based services. Online communities (OCs) enable their users to access different types of information through an Internet-based structure, anywhere and at any time. OC services are among the strategies used for the production and reposting of information by users interested in a specific area: users join a particular domain at will and begin posting. Given the networking structure, one of the major challenges these groups face is the lack of reposting behavior; most users of these systems take up a lurking position toward the posts in the forum. De-lurking is a type of social media behavior in which a user breaks an "online silence", or habit of passive thread viewing, to engage in a virtual conversation. One of the proposed ways to promote de-lurking is the selection and display of influential posts for each individual. Influential posts are selected so as to be more likely to be reposted, based on each user's interests, knowledge, and characteristics. The present article introduces a new method for selecting k influential posts to increase the reposting of information. In terms of participation in OCs, users are divided into two groups, posters and lurkers, and solutions are proposed to encourage lurking users to participate in reposting content. Assessments based on actual repost data from Twitter and real blogs indicate the effectiveness of the proposed method.
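The "k influential posts per user" idea can be illustrated with a deliberately simple stand-in: score each post by its tag overlap with the user's interests and keep the k best. The paper's actual repost-prediction model is richer; the Jaccard scoring, tag sets, and post names below are illustrative assumptions only.

```python
def top_k_posts(user_tags, posts, k=3):
    """Pick the k posts most likely to interest a user, by Jaccard tag overlap
    (a hedged stand-in for the paper's repost-behavior prediction)."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0
    ranked = sorted(posts.items(), key=lambda kv: jaccard(user_tags, kv[1]), reverse=True)
    return [post_id for post_id, _ in ranked[:k]]

# hypothetical posts with topic tags
posts = {"p1": {"python", "nlp"}, "p2": {"cooking"}, "p3": {"nlp", "web"}}
picks = top_k_posts({"nlp", "python"}, posts, k=2)
```

A user interested in NLP and Python is shown p1 and p3 rather than the cooking post, mirroring the per-user selection of influential posts the abstract describes.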
    • Open Access Article

      6 - A Semantic Approach to Person Profile Extraction from Farsi Web Documents
      Hojjat Emami, Hossein Shirazi, Ahmad Abdolahzade
      Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, Farsi is one of the less-studied and less-resourced languages, and it suffers from a lack of high-quality language processing tools; this emphasizes the necessity of developing Farsi text processing systems. As an element of EP research, we present a semantic approach to extracting the profiles of person entities from Farsi Web documents. Our approach includes three major components: (i) pre-processing, (ii) semantic analysis, and (iii) attribute extraction. First, our system takes raw text as input and annotates it using existing pre-processing tools. In the semantic analysis stage, we analyze the pre-processed text syntactically and semantically and enrich the locally processed information with semantic information obtained from a distant knowledge base. We then use a semantic rule-based approach to extract the information related to the persons in question. We show the effectiveness of our approach by testing it on a small Farsi corpus. The experimental results are encouraging and show that the proposed method outperforms baseline methods.
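The rule-based attribute-extraction component can be illustrated with hand-written patterns that fill person-profile slots from pre-processed text. The paper works on annotated Farsi text; the English sentence, slot names, and patterns below are illustrative assumptions only.

```python
import re

# hypothetical extraction rules: each slot is filled by the first matching pattern
RULES = {
    "birth_date": re.compile(r"born on (\d{4}-\d{2}-\d{2})"),
    "occupation": re.compile(r"works as an? ([a-z ]+?)[.,]"),
}

def extract_profile(text):
    """Apply each slot's pattern to the text and collect the matched attributes."""
    profile = {}
    for slot, pattern in RULES.items():
        m = pattern.search(text)
        if m:
            profile[slot] = m.group(1)
    return profile

doc = "Ali Rezaei, born on 1980-05-12, works as a journalist, lives in Tehran."
profile = extract_profile(doc)
```

In the paper's pipeline these rules would fire on semantically annotated text (entities, roles, knowledge-base links) rather than raw strings, but the slot-filling shape is the same.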
    • Open Access Article

      7 - Coreference Resolution Using Verb Knowledge
      Hasan Zafari, Maryam Hourali, Heshaam Faili
      Coreference resolution is the problem of determining which mentions in a text refer to the same entities, and it is a crucial and difficult step in every natural language processing task. Despite past efforts to solve this problem, its performance still does not meet the requirements of today's applications. Given the importance of verbs in sentences, in this work we incorporate three types of verb information into the coreference resolution problem: the selectional restrictions of verbs on their arguments, the semantic relations between verb pairs, and the fact that the arguments of a verb cannot be coreferent with each other. As a resource to support our model, we automatically generate a repository of semantic relations between verb pairs using Distributional Memory (DM), a state-of-the-art framework for distributional semantics. This resource consists of verb pairs associated with their probable arguments, their role mappings, and significance scores based on our measures. Our proposed model encodes this verb knowledge as Markov logic network rules on top of the deterministic Stanford coreference resolution system. Experimental results show that this semantic layer can improve the recall of the Stanford system while preserving, and even slightly improving, its precision.
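The third kind of verb knowledge listed above, that a verb's arguments cannot corefer, works as a hard filter on candidate mention pairs. A minimal sketch (the paper encodes this as a Markov logic rule; the plain-Python filter and example sentence here are illustrative only):

```python
def filter_candidates(pairs, verb_arguments):
    """Prune candidate coreference pairs whose two mentions are arguments of the
    same verb occurrence, since e.g. the subject and object of 'met' cannot corefer.
    verb_arguments: one set of mention strings per verb occurrence (assumed input)."""
    pruned = []
    for m1, m2 in pairs:
        if any(m1 in args and m2 in args for args in verb_arguments):
            continue  # same-verb arguments: ruled out as coreferent
        pruned.append((m1, m2))
    return pruned

# "John met Bill. He smiled."  ->  (John, Bill) are both arguments of 'met'
candidates = [("John", "Bill"), ("John", "He"), ("Bill", "He")]
kept = filter_candidates(candidates, [{"John", "Bill"}])
```

Dropping (John, Bill) leaves only the pronoun links for the downstream system to score, which is how a constraint like this can raise precision without hurting recall.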
    • Open Access Article

      8 - Effective Query Recommendation with Medoid-based Clustering using a Combination of Query, Click and Result Features
      Elham Esmaeeli-Gohari, Sajjad Zarifzadeh
      Query recommendation is now an inseparable part of web search engines. Its goal is to help users find their intended information by suggesting similar queries that better reflect their information needs. Existing approaches often consider the similarity between queries from only one aspect (e.g., similarity of query text or of search results) and do not take into account the different lexical, syntactic, and semantic templates that exist in relevant queries. In this paper, we propose a novel query recommendation method that uses a comprehensive set of features to find similar queries. We combine query text and search result features with a bipartite-graph model of user clicks to measure the similarity between queries. Our method is composed of two separate phases: offline (training) and online (test). In the offline phase, it employs an efficient k-medoids algorithm to cluster queries with tolerable processing and memory overhead. In the online phase, we devise a randomized nearest-neighbor algorithm for identifying the most similar queries with a low response time. Evaluation results on two separate datasets, from the AOL and Parsijoo search engines, show the superiority of the proposed method in improving the precision of query recommendation, e.g., by more than 20% in terms of P@10, compared with some well-known algorithms.
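The offline clustering step uses k-medoids, which, unlike K-means, keeps an actual query as each cluster's representative, convenient when queries live in a similarity space with no meaningful "mean". A minimal sketch with toy one-dimensional "queries" and a deterministic initialization (both assumptions for illustration):

```python
def k_medoids(items, dist, k, iters=10):
    """Minimal k-medoids: assign items to the nearest medoid, then move each medoid
    to the cluster member with the smallest total distance to its cluster."""
    medoids = list(items[:k])  # simple deterministic initialization (assumption)
    clusters = {}
    for _ in range(iters):
        clusters = {m: [] for m in medoids}
        for it in items:
            clusters[min(medoids, key=lambda m: dist(it, m))].append(it)
        new_medoids = [min(members, key=lambda c: sum(dist(c, o) for o in members))
                       if members else m
                       for m, members in clusters.items()]
        if set(new_medoids) == set(medoids):
            break  # converged
        medoids = new_medoids
    return medoids, clusters

# toy "queries" as 1-D points with absolute-difference distance (illustrative only)
queries = [1, 2, 3, 20, 21, 22]
meds, cls = k_medoids(queries, lambda a, b: abs(a - b), k=2)
```

In the paper's setting, `dist` would be the combined query-text / search-result / click-graph similarity, and the online phase would search for the nearest medoid (then cluster members) for an incoming query.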
    • Open Access Article

      9 - Evaluating the Cultural Anthropology of Artefacts of Computer Mediated Communication: A Case of Law Enforcement Agencies
      Chukwunonso Henry Nwokoye, Njideka N. Mbeledogu, Chikwe Umeugoji
      The renowned orientations of the cultural models proposed by Hall and Hofstede have been the subject of criticism, owing to the weak, inflexible, and old-fashioned nature of some of the designs resulting from them, as well as the ever-changing, formless, and undefined nature of culture and globalization. These criticisms have led to better-clarified frameworks for assessing the cultural anthropology of websites. Based on these later clarifications and other additions, we evaluate the cultural heuristics of websites owned by agencies of the Nigerian government; this is particularly necessary because the older models did not include Africa in their analyses. Specifically, we employed an online survey, distributing questionnaires to different groups of experts drawn from the various regions of Nigeria. The experts used methods such as manual inspection and automated tools to reach their conclusions. Afterwards, the results were assembled and, by simple majority, we decided whether each design parameter is high or low context. Findings show that website developers tend to favor low-context styles when choosing design parameters. The paper attempts to situate Africa in Hall's continuum; Nigeria (Africa) may fall between French Canadian and Scandinavian, and/or between Latin and Scandinavian, in the left-hand and right-hand side diagrams respectively. In future work, we will study the cultural anthropology of African websites using the design parameters proposed by Alexander et al.
    • Open Access Article

      10 - A Customized Web Spider for Why-QA Pairs Corpus Preparation
      Manvi Breja
      Considering the growth of research on improving the performance of non-factoid question answering systems, there is a need for an open-domain non-factoid dataset. Some datasets are available for non-factoid and even how-type questions, but no appropriate dataset exists that comprises only open-domain why-type questions covering the full range of question formats. Why-questions play a significant role and are asked in every domain. They are more complex and harder for a system to answer automatically, since why-questions seek the reasoning behind the task involved. They are prevalent and asked out of curiosity by real users, so answering them depends on the users' needs, knowledge, context, and experience. This paper develops a customized web crawler for gathering a set of why-questions, irrespective of domain, from five popular question answering sources available on the Web, viz. Answers.com, Yahoo! Answers, Suzan Verberne's open-source dataset, Quora, and Ask.com. Along with the questions, their category, document title, and appropriate answer candidates are also maintained in the dataset, and the distribution of why-questions by type and category is illustrated. To the best of our knowledge, this is the first sufficiently large dataset of 2000 open-domain why-questions with their relevant answers, and it will further help stimulate research aimed at improving the performance of non-factoid why-QA systems.
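The dataset-building step after pages are fetched, keeping only why-questions and recording their source, category, title, and answer, can be sketched as a simple filter. The field layout, site rows, and the bare "why"-prefix test below are illustrative assumptions; the paper's crawler has site-specific front-ends for each of the five sources.

```python
import re

WHY = re.compile(r"^\s*why\b", re.IGNORECASE)

def collect_why_questions(scraped):
    """Keep only why-type questions from crawled rows and normalize them into
    dataset records (site, category, title, question, answer)."""
    rows = []
    for site, category, title, question, answer in scraped:
        if WHY.match(question):
            rows.append({"site": site, "category": category, "title": title,
                         "question": question.strip(), "answer": answer})
    return rows

# hypothetical crawled rows from two of the five sources
scraped = [
    ("Quora", "Science", "Sky color", "Why is the sky blue?", "Rayleigh scattering..."),
    ("Ask.com", "History", "WW1", "When did WW1 start?", "1914."),
]
dataset = collect_why_questions(scraped)
```

A real pipeline would need more than a prefix test (e.g. "How come..." phrasings, mid-sentence why-clauses), but this shows how the crawler's output is reduced to the why-only corpus the abstract describes.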