4.4 Evolution of Software Skills
Table 5 displays the percentages of job postings that required
software skills by year. Within each category, software skills
are sorted by their popularity in 2018. Only
significant trends (i.e., those software skills that significantly
increased or decreased from 2014 to 2018) or software that
appeared in at least 10% of the job postings are displayed in
Table 5. Categories are underlined and subcategories are
indented. Bolded (italicized) software skills indicate a
significant increase (decrease) between 2014 and 2018. All
results are presented in Appendix B.
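As an illustration of how a change in a skill's share of job postings between two years could be tested for significance, the following is a minimal sketch of a pooled two-proportion z-test. It is not necessarily the test used to produce Table 5. The function name and the posting counts are hypothetical and chosen only for illustration.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 == p2 using the pooled proportion.

    x1, n1: postings mentioning the skill and total postings in year 1
    x2, n2: the same counts for year 2
    Returns (z, p_value); z > 0 means the share rose in year 2.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value under the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (not taken from the study):
# 120 of 1,000 postings in 2014 versus 190 of 1,000 postings in 2018.
z, p = two_proportion_z_test(120, 1000, 190, 1000)
print(f"z = {z:.2f}, p = {p:.5f}")
```

A positive z with a small p-value would correspond to a bolded (significantly increasing) skill, and a negative z with a small p-value to an italicized (significantly decreasing) one.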
5. DISCUSSION
5.1 Discussion by Research Question
We will discuss the results in terms of the research questions.
Although the U.S. state analysis was not a research question, it
is important to know how data analyst jobs are distributed by
U.S. state. According to Table 2, Virginia increased from
9.36% of U.S. job postings in 2014 to 18% of U.S. job postings
in 2018. Virginia is the leading data center market in the U.S.
and has the 3rd-highest concentration of high-tech workers in
the nation. Virginia is “preparing for future growth for IT
companies through its top-ranked higher education system to
build a pipeline of technology talent” (Key Industries, n.d.). The
number of job postings in Texas increased gradually each year
as well. Several companies, such as Microsoft and RackSpace,
recently established data centers in Texas. These data centers are
developing rapidly, contributing to the increasing job demand
in Texas (Mosbrucker, 2018). It is also interesting to note that
several states showed declining percentages of job postings
from 2014 to 2018. For example, Ohio, Washington, Maryland,
and Georgia all show statistically significant declines over time,
although the declines are small in magnitude.
The first research question is, “What data analyst job skills
and knowledge have remained steady from 2014 to
2018?” According to Table 4, general statistics has remained a
steady and highly desired skill over the time period studied. In
terms of software skills (see Table 5), Personal Productivity
Software (e.g., Visio, JIRA), Microsoft Office (not including
Access), and Oracle have remained steady and highly desired
during the time frame of the study. Other software or languages
that have remained steady, although not appearing in a large
percentage of job postings, include XML, Teradata, DB2,
MySQL, Linux, Visual Basic, and HTML. These general
domain skills and software skills have been steady for the past
several years and have been documented in other studies (e.g.,
Gallivan, Truex, and Kvasny, 2004; Luo, 2016).
The second research question is, “What data analyst job
skills were popular in the past, but are less attractive
now?” None of the general domain skills (see Table 4) showed
any decline from 2014 to 2018. In terms of software skills (see
Table 5), only Microsoft Access (p < 0.01), Cognos (p < 0.05),
and SAP (p < 0.05) showed statistically significant decline. For
Microsoft Access, the decline may be due to direct
competition from growing open-source database
software such as MySQL. There is some anecdotal evidence that
Microsoft Access customer support threads have been declining
(Microsoft, 2017). Cognos is an IBM business intelligence suite
that provides a toolset for reporting, analytics, scorecarding,
and monitoring of events and metrics. Although declining,
the suite still appears in 6.00% of job postings in 2018.
SAP is enterprise-wide software that helps manage
operations and customers. SAP also has BI software including
the BI suite, SAP Lumira, Hana, and Crystal Reports. Again,
although SAP has shown a statistically significant decrease
from 2014 to 2018, it is still a sought-after software skill at 8.3%
of job postings in 2018. In summary, three software skills
showed statistically significant decline, but these three software
packages are still desired skills in the market.
The third research question is, “What data analyst job skills
are gaining attention in the current job market?” There are
numerous upward-trending general domain skills and software
skills from 2014 to 2018. First, we examine the upward-trending
skills that are in the top quartile in terms of job postings. The
percentages indicate the share of job postings requiring that
skill in 2018. General statistics (28%, p < 0.001), modeling
(21%, p < 0.01), model development (26%, p < 0.001), data
management (50%, p < 0.001), database systems (59%,
p < 0.001), BI (23%, p < 0.001), programming languages (23%,
p < 0.001), and enterprise systems (21%, p < 0.001) all
increased significantly from 2014 to 2018 and are highly
desired skills. In terms of software skills or languages, SQL
Server (18%, p < 0.001), Tableau (19%, p < 0.001), statistical
packages (16%, p < 0.001), SAS (10%, p < 0.001), R (12%, p
< 0.001), and Python (11%, p < 0.001) are all in the top quartile
in terms of job postings and show a statistically significant
increase.
The next set of general domain skills and software packages
is increasing in demand over time but represents the next
quartile (top 50% of job postings). These software packages
include SPSS (3%, p < 0.05), Hive (2%, p < 0.001), Salesforce
(5%, p < 0.05), Hadoop (5%, p < 0.001), and Microsoft Azure
(4%, p < 0.001). Although demand for these skills grew during
this time frame, they do not represent the top quartile in terms
of the number of entry-level job postings asking for that
skillset.
Lastly, the skills that grew between 2014 and 2018 but
represent the lower 50% of the total entry-level job postings
include NoSQL (1.6%, p < 0.01), Microsoft Power BI (1.9%, p
< 0.001), Apache Pig (1%, p < 0.01), and Google Analytics
(1.2%, p < 0.05). An interesting observation is the lack of job
postings that mention NoSQL. NoSQL, short for “not only SQL,”
refers to non-relational databases that scale horizontally.
Only a small fraction of job postings mentioned NoSQL, which
indicates that NoSQL has not increased in popularity as
previously predicted (Pal, 2016). This is useful information for
instructors of database courses. If time is limited, instructors
should focus on relational databases instead of NoSQL, since
NoSQL is not in high demand in the industry.
There are several other software packages that are not
trending up or down (i.e., remained steady from 2014 to 2018) but
only represent a very small fraction of entry-level job postings.
For example, MongoDB (0.3%), Apache HBase (0.6%),
Apache Cassandra (0.2%), Pentaho (0.3%), the JavaScript
visualization library D3 (0.2%), STATA (0.5%), Ruby
(0.3%), and IBM Watson (0.2%) each appear in well under 1%
of job postings. Some of these software programs are taught in
database, analytics, and BI courses and are widely known in the
industry. However, the results of this research demonstrate that
they are not a widely needed skillset for entry-level data
analytics jobs. Therefore, given the time constraints of a course,
Journal of Information Systems Education, Vol. 31(4) Fall 2020