A new paper has been accepted in the journal Complexity. The authors are Carlos Mazano, Claudio Meneses, and Paul Leger (https://doi.org/10.1155/2021/6760920, to appear). Here is the abstract:
Malware is a sophisticated, malicious, and sometimes unidentifiable application on the network. Classifying network traffic with machine learning has been shown to perform well in detecting malware. In the literature, it is reported that this good performance can depend on a reduced set of network features. This study presents an empirical evaluation of two statistical methods of reduction and selection of features in an Android network traffic dataset using six supervised algorithms: Naïve Bayes, Support Vector Machine, Multilayer Perceptron Neural Network, Decision Tree, Random Forest, and K-Nearest Neighbors. The Principal Component Analysis (PCA) and Logistic Regression (LR) methods with p-value were applied to select the most representative features related to the time properties of flows and features of bidirectional packets. The selected features were used to train the algorithms using binary and multiclass classification. For performance evaluation and comparison, the metrics precision, recall, F-measure, accuracy, and area under the curve (AUC-ROC) were used. The empirical results show that Random Forest obtains an average accuracy of 96% and an AUC-ROC of 0.98 in binary classification. For the case of multiclass classification, Random Forest again achieves an average accuracy of 87% and an AUC-ROC over 95%, exhibiting better performance than the other machine learning algorithms. In both experiments, the 13 most representative features of a mixed set of flow time properties and bidirectional network packets selected by LR were used. In the case of the other five classifiers, their results in terms of precision, recall, and accuracy are competitive with those obtained in related works, which used a greater number of input features. Therefore, it is empirically evidenced that the proposed method for the selection of features, based on statistical techniques of reduction and extraction of attributes, improves the identification performance of malware traffic, discriminating it from the benign traffic of Android applications.
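The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions; the function names and all counts and scores below are hypothetical and are not taken from the paper or its dataset.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


def auc_roc(scores_pos, scores_neg):
    """AUC-ROC as the probability that a positive outranks a negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts: "positive" stands for malware traffic, "negative" for benign.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
# Hypothetical classifier scores for three malware flows and three benign flows.
auc = auc_roc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"auc_roc: {auc:.3f}")
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for illustration but not for a full traffic dataset.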
A new paper has been accepted in the Journal of Simulation. The authors are Oswaldo Téran, Paul Leger, and Manuela López (https://doi.org/10.1155/2021/6760920). Here is the abstract:
Chinese cross-border e-commerce has become the largest in the world, overtaking US e-commerce and representing about 40% of total global e-commerce spending in 2018. This market is highly complex, uncertain, and poorly understood. Surveys and statistics have been used to characterise it, but new approaches are required to better understand its complexity. To address this gap, we present an agent-based model of Chinese cross-border e-commerce.
For a realistic representation of the buyers’ decision-making mechanism and some elements of their communication, including word of mouth (WOM), we use endorsements theory, and a survey is used to specify the model. The aim of the study is twofold: (1) to present an agent-based simulation (ABS) model of the Chinese cross-border e-commerce market; and (2) to illustrate the potential of the model to explore possible future configurations of the market and to guide stakeholders’ decision making.
The application of Artificial Neural Networks (ANNs) to different domains grows stronger every day. In this accepted paper, we used ANNs to detect Fetal Alcohol Spectrum Disorder in children. The paper was accepted in the high-impact journal Applied Sciences (IF = 2.458).
Abstract: Fetal alcohol spectrum disorder (FASD) is an umbrella term for children’s conditions due to their mother having consumed alcohol during pregnancy. These conditions can be mild to severe, affecting the subject’s quality of life. An earlier diagnosis of FASD is crucial for an improved quality of life of children by allowing a better inclusion in the educational system. New trends in computer-based diagnosis to detect FASD include using Machine Learning (ML) tools to detect this syndrome. However, most of these studies rely on children’s images that can be invasive and costly. Therefore, this paper presents a study that focuses on evaluating an ANN to classify children with FASD using non-invasive and more accessible data. The data come from a battery of tests obtained from children, including psychometric, saccade eye movement, and diffusion tensor imaging (DTI). We study different configurations of ANNs with dense layers; among these, the psychometric data perform best, with 75% of outcomes correctly classified. The other models include a feature layer, and we used it to predict FASD using every test individually. These models obtained accuracies of 88.46% (psychometric), 74.07% (antisaccade), 72.24% (prosaccade), 88% (memory-guided saccade), and 75% (DTI). These results suggest that the ANN approach is a competitive and efficient methodology to detect FASD. They improve on Zhang’s 2019 model, which used the same data and reached a lower accuracy.
A high-impact article named “A Collaborative Method for Scoping Software Product Lines: a Case Study in a Small Software Company” was accepted in the indexed journal Applied Sciences (IF = 2.458). This work was developed with members from Colombia and Chile.
Abstract: SPL scoping is the activity for bounding Software Product Lines (SPL), gathering heterogeneous knowledge from diverse sources. For achieving an agreement among different stakeholders, a common scope must be understood and committed to. However, gathering this knowledge from stakeholders with individual interests is a complex task. This paper reports the experience of scoping the SPL of a small Colombian software company, applying and evaluating a collaborative method called CoMeS-SPL. The company was looking to develop a set of products from a previously developed product with great potential to be adapted and sold to different customers. Under a collaborative university–enterprise relationship model, the research groups that developed CoMeS-SPL proposed using it in response to the company’s need to define an organization-suitable reuse scope around its platform, called CORA. Both parties joined in the scoping co-production of the first SPL of the company. This method implied that the company would perform new tasks and involve roles different from those used to defining the scope of a single product. The company actors considered that they obtained a useful scope and perceived the collaboration as valuable because they shared different knowledge and perspectives. The researchers were able to provide feedback on their proposed model, identifying successes and aspects to improve. The experience strengthened the ties of cooperation with the company, and new projects and consultancies are being carried out.
Diego Cortes and Domingo Pinto, undergraduate students who work at Pragmatics, published the paper “An Architecture for an Open Implementation of an Agent-Based Model for WOM Marketing Campaigns”. Pragmatics professors Paul Leger, Manuela López, and Ismael Figueroa also participated in this paper.
The paper “Which Monads Haskell Developers Use: An Exploratory Study” has been accepted in Science of Computer Programming, a journal specializing in programming languages and software engineering.
Monads are a mechanism for embedding and reasoning about notions of computation such as mutable state, I/O, exceptions, and many others. Even though monads are technically language-agnostic, they are mostly associated with the Haskell language. Indeed, one could argue that the use of monads is one of the defining characteristics of the Haskell language. In practical terms, monadic programming in Haskell relies on the standard mtl package library, which provides eight core notions of computation: identity, error, list, state, reader, writer, RWS, and continuations. Despite their widespread use, we are not aware of any empirical investigations regarding which monads are the most used by developers. In this paper we present an empirical study that covers a snapshot of available packages in the Hackage repository, covering 85135 packages and more than five million Haskell files. To the best of our knowledge, this is the first large-scale analysis of Hackage with regard to monads and their usage as dependencies. Our results show that around 30.8% of the packages depend on the mtl package, whereas only 1.2% depend on alternative, yet compatible implementations. Nevertheless, usage patterns for each specific monad remain similar both for mtl and alternatives. Finally, the state monad is by far the most popular one, although all of them are used. We also report on the distribution of packages that use mtl, regarding their category and stability level.
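Since the abstract notes that monads are language-agnostic and that the state monad is the most used, the idea can be sketched outside Haskell. The following Python sketch is only an illustration: the `State` class, `unit`, `bind`, `get`, `put`, and `tick` are names chosen here, loosely mirroring Haskell's `return`, `>>=`, `get`, and `put`. It is not the mtl implementation and does not come from the paper.

```python
class State:
    """A stateful computation, modeled as a function: state -> (value, new_state)."""

    def __init__(self, run):
        self.run = run

    @staticmethod
    def unit(value):
        """Wrap a plain value without touching the state (like Haskell's `return`)."""
        return State(lambda s: (value, s))

    def bind(self, f):
        """Run this computation, then feed its value to f, threading the state (like `>>=`)."""
        def run(s):
            value, s1 = self.run(s)
            return f(value).run(s1)
        return State(run)


# Read the current state as the result.
get = State(lambda s: (s, s))


def put(new_state):
    """Replace the state; the result value is None."""
    return State(lambda _s: (None, new_state))


# A counter step: return the current count and increment the stored state.
tick = get.bind(lambda n: put(n + 1).bind(lambda _: State.unit(n)))

# Sequence two ticks; the state is threaded implicitly by bind.
two_ticks = tick.bind(lambda a: tick.bind(lambda b: State.unit((a, b))))

result, final_state = two_ticks.run(0)
print(result, final_state)
```

No variable is mutated anywhere: the state is passed from one step to the next by `bind`, which is the notion of computation the state monad captures.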
A paper in the “business for computing” area was accepted in the journal “Revista de Investigación Aplicada en Ciencias Empresariales” (Chilean Journal – LatinIndex).
The purpose of this article is to determine the gaps that small businesses in the Coquimbo region face in the use of SIAs. Of the companies surveyed (N = 106), only 14% report having a tailor-made SIA, and only 8% use a standard one. 52% of companies use Excel to analyze their relevant information, which shows a lack of professionalization in information management, especially since 19% of them control their information manually.
The main gaps identified in the organizational field are the fear of the unknown, along with resistance to change and low knowledge regarding the SIA concept. The gap related to the financial field is the lack or absence of monetary funds to implement SIAs that allow them to advance in their digital transformation.
The SIAs go hand in hand with the evolution, improvement and greater ordering in companies, so they must be encouraged and well used. Small businesses in the Coquimbo region, together with strengthening the competences of the human team, must advance in the use of their information in an efficient way to boost their productivity.
The paper “A Practical Methodology to Learn Computer Architecture, Assembly Language, and Operating System” was accepted and presented at the International Conference on Computer Supported Education (CSEDU), Prague, Czech Republic, 2020.
The paper abstract:
System-level details, such as assembly language and operating systems, are important to develop/debug embedded systems and analyze malware. Therefore, it is recommended to teach every topic of these subjects. However, their learning cost has increased significantly due to the complexity of current systems. To solve this problem, several visualization techniques have been proposed to help students in their learning process. However, only observing the behavior of a computer system may be insufficient for applying that knowledge to real systems, due to the lack of practical experience and of a comprehensive understanding of system-level details. To address these issues, we propose a novel methodology where students implement a virtual machine instead of using existing ones. This virtual machine needs to execute binary programs that can be run on a real operating system. Through implementing this virtual machine, students improve by experience their understanding of computer architecture, assembly languages, instruction sets, and the role of operating systems. We also provide MMVM, a reference virtual machine implementation that can execute the binary programs while showing the internal state of the CPU (registers and flags) to users (students) to support their implementation. Finally, this paper reports the educational results of applying this methodology to 15 students, consisting of 3rd-year students and 1st-year master's students.
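To give a flavor of what implementing a virtual machine involves, here is a minimal Python sketch of a fetch-decode-execute loop that records registers and a zero flag after each instruction. The instruction set, its byte encoding, and the `run` function are invented for this illustration; they are not MMVM's and do not correspond to any real operating system's binary format.

```python
# Hypothetical opcodes; each instruction except HALT is 3 bytes: opcode, register, immediate.
LOAD, ADD, SUB, HALT = 0x01, 0x02, 0x03, 0xFF


def run(program):
    """Execute a byte program; return final registers, zero flag, and a step trace."""
    regs = [0, 0]        # two 8-bit general-purpose registers, r0 and r1
    zero_flag = False    # set when an arithmetic result is zero
    pc = 0               # program counter (byte offset)
    trace = []
    while pc < len(program):
        opcode = program[pc]                          # fetch
        if opcode == HALT:
            break
        reg, imm = program[pc + 1], program[pc + 2]   # decode operands
        if opcode == LOAD:                            # execute
            regs[reg] = imm
        elif opcode == ADD:
            regs[reg] = (regs[reg] + imm) & 0xFF
            zero_flag = regs[reg] == 0
        elif opcode == SUB:
            regs[reg] = (regs[reg] - imm) & 0xFF
            zero_flag = regs[reg] == 0
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc}")
        trace.append((pc, tuple(regs), zero_flag))    # expose the internal state
        pc += 3
    return regs, zero_flag, trace


# r0 = 5; r0 += 3; r0 -= 8; halt
program = bytes([LOAD, 0, 5, ADD, 0, 3, SUB, 0, 8, HALT])
regs, zero_flag, trace = run(program)
for pc, state, zf in trace:
    print(f"pc={pc} regs={state} zero_flag={zf}")
```

In this sketch only the arithmetic instructions update the flag, a design choice borrowed from common CPU conventions rather than from the paper.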