Data Engineer, Analytics, NCBA


Job Overview

  • Date Posted
    June 15, 2025
  • Location
  • Expiration date
    June 22, 2025
  • Experience
    5 Years
  • Gender
    Both
  • Qualification
    Bachelor Degree
  • Career Level
    Mid-Level / Officer

Job Description

JOB PURPOSE STATEMENT

The Data Engineering team is responsible for documenting data models, architecting distributed systems,
creating reliable data pipelines, combining data sources, architecting data stores, and collaborating with
the data science teams to build the right solutions for them.
They do this using Open Source Big Data platforms such as Apache NiFi, Kafka, Hadoop, Apache
Spark, Apache Hive, HBase and Druid, together with the Java programming language, picking the right tool
for each purpose.
The growth of every product relies heavily on data, such as for scoring and for studying product behavior
that may be used for improvement, and it is the role of the data engineer to build fast, horizontally
scalable architectures using modern tools that go beyond traditional Business Intelligence systems.

KEY RESPONSIBILITIES & PERCENTAGE (%) TIME SPENT

  • Documenting Data Models: The role will be responsible for documenting the entire journey that data
    elements take end-to-end, from the data sources to all the data stores, including all the
    transformations in between, and keeping those documents up to date with every change. (10%).
  • Architecting Distributed Systems: Modern data engineering platforms are distributed systems. The data
    engineer designs the right architecture for each solution, utilizing best-of-breed Open Source
    tools in the big data ecosystem, because no one tool does everything; the tools are
    specialized, lean and fit for purpose. The architecture should be able to process
    any data, at any time, anywhere, for any workload. (10%).
  • Combining Data Sources: pulling data from different sources, which may hold structured,
    semi-structured or unstructured data, using tools such as Apache NiFi, and taking the data through a
    journey that creates a final state useful to the data consumers. These sources can include REST
    APIs, JDBC, Twitter, JMS, images, PDF and MS Word documents; the data is then landed in a staging
    environment such as Kafka topics for onward processing. (10%).
  • Developing Data Pipelines: creating data pipelines that will transform data using tools such as
    Apache Spark and the Java programming language. The pipelines may apply processing such as
    machine learning, aggregation, iterative computation, and so on. (40%).
  • Architecting Data Stores: Designing and creating data stores using big data platforms such as
    Hadoop and NoSQL databases such as HBase. (15%).
  • Data Query and Analysis: Utilizing tools such as Apache Hive to analyze data in the data stores to
    generate business insights. (10%)
  • Team Leadership: Providing team leadership to the data engineers. (5%)
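
As an illustration of the kind of transformation a pipeline stage performs, here is a simplified sketch in plain Java; a production pipeline would run equivalent logic as a distributed Apache Spark job, and the record type and field names below are hypothetical, not part of the role's actual systems:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PipelineSketch {

    // Hypothetical record: one transaction event pulled from a staging topic.
    record Transaction(String customerId, double amount) {}

    // A single pipeline stage: filter out invalid rows, then group raw
    // transactions by customer and aggregate total spend -- the kind of
    // step a Spark job would express as a filter followed by a keyed
    // aggregation over a distributed dataset.
    static Map<String, Double> totalSpendByCustomer(List<Transaction> txns) {
        return txns.stream()
                .filter(t -> t.amount() > 0) // drop reversals/invalid rows
                .collect(Collectors.groupingBy(
                        Transaction::customerId,
                        Collectors.summingDouble(Transaction::amount)));
    }

    public static void main(String[] args) {
        List<Transaction> txns = List.of(
                new Transaction("C001", 150.0),
                new Transaction("C002", 40.0),
                new Transaction("C001", 60.0),
                new Transaction("C002", -5.0)); // reversal, filtered out

        System.out.println(totalSpendByCustomer(txns));
    }
}
```

The same shape (ingest, filter, keyed aggregation) underlies the scoring and product-behavior analyses described above, whether the output lands in HBase or is queried later through Hive.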

QUALIFICATION AND EXPERIENCE REQUIREMENTS

  • A Bachelor’s degree in Computer Science
  • Minimum 5 years’ experience developing object-oriented applications using the Java programming language
  • Certification in and experience implementing best-practice frameworks (e.g. ITIL, PRINCE2) preferred
  • Minimum 5 years’ experience working with relational databases
  • Minimum 5 years’ experience working with the Linux operating system
  • Experience with Open Source Big Data platforms and tools (Hadoop, Kafka, Apache NiFi,
    Apache Spark, Apache Hive, NoSQL databases) and ODI
  • Experience working with Data Warehouses
  • Experience with DevOps, Agile ways of working and CI/CD
  • Familiarity with complex systems integration using SOA tools (Oracle WebLogic/ESB/SOA)
  • Familiarity with industry-standard formats and protocols (JMS, SOAP, XML/XPath/XQuery, REST and
    JSON) and data sources
  • Excellent analytical, problem solving and reporting skills
  • A good knowledge of the systems and processes within the Financial Services industry