4 data scientists share their Elsevier job experience
Ronald van Loon
CEO, Principal Analyst Intelligent World | Helping AI-Driven Companies Generating Success | Top10 AI-Data-IoT-Influencer
The world of data science is moving at a rate of knots, and we are seeing many young minds drawn towards it. Aspiring candidates want to work in the best analytics firms; they’re looking for opportunities to develop their careers and be an intrinsic part of the market.
One such firm that offers brilliant growth and learning potential is Elsevier, which uses the power of analytics and data to enrich, discover, and segment science and health information in a flawless manner. Data scientists at Elsevier work on interesting projects, learning a lot about the world of data and its intricacies.
As a part of my study of the market, and as an Elsevier partner, I got in touch with four data scientists at Elsevier: Maya Hristakeva in New York, Dr. Daniel Kershaw and Dr. Harriet Muncey in London, and Deep Kayal in Amsterdam. They are all relatively new, which is why the trajectory that they have followed may resonate well with you.
I talked to these data scientists about the demands of the job and how being at Elsevier helps them develop advanced analytical skills.
Working towards the future
My first question was about their work on projects for the future of research. Maya, a data science manager, said Elsevier works with a lot of great content. And since data scientists sit at the centre of the company's hierarchy, they are responsible for choosing the right tools to extract results quickly. The end goal is to improve the customer experience.
Deep, a machine learning and analytics professional on the Content Enrichment and Information Extraction team, talked about Elsevier’s “gold mine” of information. The company publishes about 20 percent of the world’s scientific and medical content, and its data scientists mine this content and more for key facts so researchers and medical professionals can find relevant information faster. Deep also mentioned that the technology used within Elsevier is diverse.
“The technology usage in Elsevier is quite varied, and most of it is open-source, so one day we might use Python to write quick proof-of-concepts, while the next we can use Scala and big-data frameworks like Spark to put the proof-of-concept into production. In the technology stack, there are usual suspects like Python, Scala and Java for programming, Spark for data-processing, Docker for containerisation, and Kubernetes for orchestration, while in the machine learning stack, we leverage libraries like MLLib, Sklearn, Keras and TensorFlow to get our work done.”
Daniel, a research scientist with specialized knowledge in machine learning and recommender systems, is a senior data scientist in Elsevier’s Personalization and Discovery Services department. He said all data scientists at Elsevier are at the forefront of personalization. He talked about using high quality tools to achieve the goals Elsevier has in mind for the future.
Exciting real-life case studies
To delve even more into the life of a data scientist and the kind of work they do, I asked all four of them about the current projects they are working on.
Deep had been working on a recent project to automatically extract the names of funding agencies from numerous scientific manuscripts. He said the information was critical, as researchers are asked to declare who is funding their work. This information can help all related agencies track the funders and see their other funded work. Deep talked about training various models to extract the information.
Daniel told me that he and Maya had been working on a recommender system. The article recommendations for platforms like ScienceDirect are extremely important. These recommendations help users explore related articles that might appeal to their interest. They created Learning to Rank (LTR) models to study citation features and text similarity, among other things, so the system can recommend articles in a more precise and comprehensive manner.
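To make the Learning to Rank idea a little more concrete, here is a minimal sketch in Python. It is not Elsevier's system: the two features (a shared-citation score and a text-similarity score), the click labels, and the function names are all invented for illustration. It fits a simple linear model on pairs of candidates so that articles a user found relevant score above those they ignored.

```python
import math

# Hypothetical candidates for one source article:
# (shared_citation_score, text_similarity), relevance label (1 = relevant).
candidates = [
    ((0.9, 0.8), 1),
    ((0.7, 0.9), 1),
    ((0.2, 0.3), 0),
    ((0.1, 0.6), 0),
]

def train_pairwise(data, epochs=200, lr=0.5):
    """Fit linear weights so relevant items score above irrelevant ones."""
    weights = [0.0, 0.0]
    # Every (relevant, irrelevant) combination becomes one training pair.
    pairs = [(pos, neg)
             for pos, pos_label in data if pos_label == 1
             for neg, neg_label in data if neg_label == 0]
    for _ in range(epochs):
        for pos, neg in pairs:
            diff = [p - n for p, n in zip(pos, neg)]
            margin = sum(w * d for w, d in zip(weights, diff))
            # Gradient step on the logistic loss log(1 + exp(-margin)).
            step = 1.0 / (1.0 + math.exp(margin))
            weights = [w + lr * step * d for w, d in zip(weights, diff)]
    return weights

def score(weights, features):
    """Linear relevance score for one candidate article."""
    return sum(w * f for w, f in zip(weights, features))

weights = train_pairwise(candidates)
ranked = sorted(candidates, key=lambda item: score(weights, item[0]), reverse=True)
for features, label in ranked:
    print(features, label, round(score(weights, features), 3))
```

A production recommender would use many more features and a dedicated ranking library, but the principle is the same: learn from preferences between items rather than from isolated labels.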
Harriet, a senior data scientist who focuses on user understanding and business intelligence at Elsevier, said she was working on a model that predicts user behaviour. She explained how analysing such data sets has helped increase user traffic and monitor it more effectively.
The power of team collaboration
Then we talked about the importance of teamwork as a data scientist.
Maya said people’s diverse backgrounds help with idea generation, and that their brainstorming sessions generate positive dialogue and filter out ideas:
“People on the team bring different perspectives to the table. In our brainstorming sessions—collective intelligence—there is a lot of creative dialogue that goes on—a lot of trial and error to figure things out. Collaborating and working as a team, we can find a good solution faster. Data scientists work hand-in-hand with engineers and product owners as part of cross-functional teams to build the end-to-end product.”
Deep, Daniel and Harriet all agreed that teamwork plays an important role in the daily decision-making of a data scientist. Daniel said every person brings with them a unique skill set that others don’t have, so a team is a cumulative unit of people with individual characteristics and strengths.
Common challenges
We also had to talk about some of the common challenges these data scientists face in their daily work. All four of them agreed that data acquisition and cleaning are among the biggest challenges.
Daniel added that figuring out how to set up a good evaluation framework to compare different machine learning models can also be quite challenging.
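One common way to build such an evaluation framework for ranking models is to score every model on the same held-out data with a single metric. The sketch below uses NDCG@k, a standard ranking metric. The interview does not say which metric Daniel's team uses, and the model names and relevance labels here are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the top-k items in ranked order."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalised by the best achievable DCG for the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def compare_models(rankings_by_model, k=3):
    """Mean NDCG@k per model over the same set of held-out queries."""
    return {
        name: sum(ndcg_at_k(r, k) for r in rankings) / len(rankings)
        for name, rankings in rankings_by_model.items()
    }

# Hypothetical data: relevance labels listed in the order each model
# ranked the candidates, for the same two held-out queries.
held_out = {
    "model_a": [[1, 1, 0, 0], [1, 0, 1, 0]],
    "model_b": [[0, 1, 1, 0], [0, 0, 1, 1]],
}

results = compare_models(held_out)
best = max(results, key=results.get)
print(results, best)
```

Keeping the queries, the metric, and the cutoff k fixed across models is what makes the comparison fair; offline scores like these are usually checked against live A/B tests before a model ships.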
Harriet said data wrangling is a challenge she faces regularly:
“Data wrangling can be a challenge since we deal with many huge data sets with different characteristics and structures. Determining the most appropriate data sources and the best way to represent them can be tricky, but this becomes easier as our data lakes and pipelines mature. Improvements are continuously developed as we find new and interesting use cases to explore.”
Skills required for a career as a data scientist
To conclude our discussion, I asked all four about the skills that they think are necessary for a successful data scientist.
Maya said technical expertise and problem-solving skills are a must. Good communication skills and the ability to work in a team are very important as well, but candidates first need to be technically fit for the job.
Deep said that besides the general aptitude most candidates have, data scientists should know the basics of the software being used, be patient, have the foresight to anticipate where things are heading, have business acumen and bring a collaborative spirit.
Daniel believes that the following five skills define a data scientist:
- Technical knowledge
- Adaptability
- Continuous learning
- Communication skills
- Product understanding
While Harriet has no list of skills, she believes that all new employees should be curious enough to keep on learning. She said it is curiosity that drives innovation – something that Elsevier wants to enable even more.
You can learn more about becoming a data scientist by visiting Elsevier’s Technology Careers site.