Content – Aim of the course
Data science is a relatively modern term that emerged from earlier ones such as Knowledge Discovery in Databases and Data Mining; it deals with the search, understanding and utilization of big data. Biotechnology can be defined as “any technological application that uses biological systems, live organisms and their products in order to develop products or processes for specific uses”. The technological elements it draws on in this mission (mathematics, statistics, information technology, data availability etc.) evolve over time, and data science is now one of them. Moreover, since companies in the biotechnology industry rely on data and on information technologies, a scientist in this domain must hold competencies in data science.
Furthermore, because biotechnologists are researchers who apply statistics in biology, they are in a sense data scientists too. Both biotechnologists and data scientists are experts in research design (experimental, pre-experimental and quasi-experimental) and are therefore familiar with the triplet of mathematics, statistics (biostatistics) and programming. They collect vast amounts of data from the dynamic systems of the molecular world and analyze them in detail in order to identify the relevant factors, a task that may require considerable computational power. As such, biotechnologists learn to use software tools like R and Python, and to collect and analyze data from databases; according to recent studies from head-hunters (Glassdoor), these skills will enable them to enter competitive areas of the labor market.
This course has a double objective: first, to offer the theoretical background and technical skills of data science and, second, to help students understand how they can utilize data (i.e., biological data) to produce predictive models in biotechnology.