
Narwhal Data Solutions

DevOps, Web Development, Data Engineering, Data Science
4.8
2 reviews
Narwhal Data Solutions isn't just a consultancy; we're your trusted partner in unlocking the true potential of your data. Our journey began with a simple yet powerful mission: to provide top-tier, vendor-independent data engineering, analytics, and data science services. Our goal is to make sense of your data and help you create business value from it. Our team of highly skilled engineers is your comprehensive solution for all things data-related. From building scalable and sustainable data processing and analytics systems to crafting customised business intelligence solutions, we've got you covered.

Our service offerings:

1. Big data analytics
2. Data warehousing
3. Data engineering
4. Artificial intelligence
5. Consulting

Whether on-premises or in the cloud, we create robust data infrastructure and deliver scalable data processing pipelines. With extensive experience across major cloud providers and cutting-edge data processing technologies, we guide you in making the right technology choices and architecting end-to-end solutions. At Narwhal Data Solutions, we deal in facts, not guesswork. Our mission is to empower companies with actionable information, enabling precision in decision-making. Let us help you build data systems the right way!
Daily rate
760€/day
Employees
6 employees in total
Company type
Established
Homepage
https://narwhaldatasolutions.com/
Location
Berlin

References

Customer Service Analytics

Transport company
Transport and Logistics

Verified ratings

Communication
Adherence to deadlines
Quality

10/2022 - 11/2023

100,000 euros

Berlin / Munich


Project description

The company's customer service unit faced significant challenges with its existing reporting platform, leading to a lack of trust in the reported numbers among stakeholders. As it turned out, stakeholders had differing perspectives and understandings of the underlying processes and metrics. Dispersed and outdated technical and business documentation contributed to the inconsistency. To address these issues and become a truly data-driven organisation, the company needed a solid foundation for reliable analytics and informed decision-making.

Narwhal Data Solutions formed a small team of experienced data architects and engineers. Working closely with the client's management and technical staff, the team took a comprehensive approach to tackle the problems. The first step involved gathering and consolidating all documentation into a unified repository. Business and technical definitions, assumptions, and processes were meticulously documented, and a unified KPI library was created to establish precise and unambiguous metric definitions. Through collaboration with critical stakeholders, a shared understanding and consensus were achieved.
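
A unified KPI library of the kind described can be as simple as a machine-readable registry of metric definitions that every report references. A minimal sketch of the idea — the metric names, formulas, and owners below are invented for illustration, not the client's actual definitions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Kpi:
    """Single, unambiguous definition of a business metric."""
    name: str
    definition: str   # plain-language business definition
    formula: str      # canonical calculation, agreed with stakeholders
    owner: str        # accountable stakeholder

# Illustrative entries -- not the client's real metrics.
KPI_LIBRARY = {
    "first_response_time": Kpi(
        name="First Response Time",
        definition="Time from ticket creation to first agent reply.",
        formula="avg(first_reply_at - created_at)",
        owner="Customer Service",
    ),
    "resolution_rate": Kpi(
        name="Resolution Rate",
        definition="Share of tickets resolved within SLA.",
        formula="resolved_within_sla / total_tickets",
        owner="Customer Service",
    ),
}

def lookup(metric: str) -> Kpi:
    """Reports reference metrics by key, so every team uses one definition."""
    return KPI_LIBRARY[metric]
```

Keeping definitions in one versioned artifact, rather than scattered across report logic, is what makes "precise and unambiguous metric definitions" enforceable in practice.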

A modern cloud technology stack was employed, including dbt (Data Build Tool), Snowflake, and Power BI. Using the Kimball star schema, we built a data warehouse, which in turn served as a single data source for the reporting. The implementation process revealed data inconsistencies and quality issues in the source systems, which were addressed in close collaboration with the client's teams. Clear data contracts were established and enforced through automated tests to ensure data integrity and early detection of issues.
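
Data contracts of this kind are typically enforced as automated tests that run on every load and fail fast on violations. A minimal, framework-free sketch of the pattern — the column names and allowed values are hypothetical, and a real project would express such checks as dbt tests rather than hand-rolled Python:

```python
def check_contract(rows):
    """Validate a batch of source records against a simple data contract:
    no null IDs, and status drawn from a closed vocabulary."""
    allowed_status = {"open", "in_progress", "closed"}
    errors = []
    for i, row in enumerate(rows):
        if row.get("ticket_id") is None:
            errors.append(f"row {i}: ticket_id is null")
        if row.get("status") not in allowed_status:
            errors.append(f"row {i}: unexpected status {row.get('status')!r}")
    return errors

batch = [
    {"ticket_id": 1, "status": "open"},
    {"ticket_id": None, "status": "closed"},
    {"ticket_id": 3, "status": "escalated"},
]
violations = check_contract(batch)
# Two violations: a null ticket_id in row 1 and an unknown status in row 2.
```

Running such checks at load time is what enables the "early detection of issues" mentioned above: bad records are flagged at the boundary instead of silently corrupting downstream reports.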

The successful implementation of the solution delivered significant improvements to the organisation's business intelligence capabilities. Data quality and reporting integrity were greatly enhanced, providing stakeholders with accurate and reliable information. The new data warehouse served as a solid foundation for reporting and analytics, while comprehensive documentation, tests, and CI/CD pipelines facilitated maintenance and future enhancements.

With a competent business intelligence team now in charge of maintaining and developing the platform, Narwhal Data Solutions continues to provide support and guidance as needed.

Key Performance Indicator
Data Engineering
dbt
Data warehouse
Kimball
Customer Service
CI/CD
Client Management
Data Integration
Business Intelligence
Snowflake
Microsoft Power BI
Collaboration

Analytics platform for e-logistics

ViaEurope
Transport and Logistics

Verified ratings

Communication
Adherence to deadlines
Quality

03/2021 - 09/2022

250,000 euros

Berlin / Amsterdam


Comment

We hired Narwhal Data Solutions to help us set up a data warehouse using AWS RDS and dbt, for consumption by Microsoft PowerBI.

Marek Strzelczyk and his team have been nothing but professional, supportive, flexible, polite and friendly. With their help we've been able to set up a comprehensive reporting and analytics platform for our logistics and ecommerce data. Their experience and advice on the setup of the ETL pipeline and the general data architecture have been invaluable.

Project description

A rapidly expanding scale-up faced the challenge of obtaining relevant and accurate data to support decision-making processes across various departments. The existing legacy business intelligence (BI) system relied on individual reports accessing a transactional database, leading to redundant and complex logic and performance issues as data volume grew. To address these challenges, the company partnered with Narwhal Data Solutions to establish a dedicated data team and develop a robust data platform.

The solution involved implementing a cloud data warehouse infrastructure on AWS, utilising PostgreSQL as the open-source technology of choice. Through workshops, the team gathered information on key business processes, data sources, and reporting requirements. This knowledge was used for the architecture design of the new data platform. PowerBI was selected as the reporting tool, including its recently released Datamarts feature.

Following the Kimball methodology, the data team modeled complex logistic processes using a star schema composed of multiple facts and dimensions. Native PostgreSQL methods were used for data loading, and dbt (Data Build Tool) facilitated data transformations, documentation, and data tests. Performance optimization was a crucial aspect given the large volume of data (~1 TB), high expectations for data freshness and the scalability requirements.
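
In a Kimball star schema, a central fact table references surrounding dimension tables by surrogate keys, and reports aggregate the facts sliced by dimension attributes. A toy illustration using sqlite3 — the table and column names are invented for the example, not the client's actual model:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# One dimension and one fact table -- a real warehouse has many of each.
cur.execute("CREATE TABLE dim_warehouse (warehouse_key INTEGER PRIMARY KEY, city TEXT)")
cur.execute("""CREATE TABLE fact_shipment (
    shipment_id INTEGER, warehouse_key INTEGER, parcels INTEGER)""")

cur.executemany("INSERT INTO dim_warehouse VALUES (?, ?)",
                [(1, "Amsterdam"), (2, "Berlin")])
cur.executemany("INSERT INTO fact_shipment VALUES (?, ?, ?)",
                [(100, 1, 40), (101, 1, 60), (102, 2, 25)])

# Typical star-schema query: aggregate facts, sliced by a dimension attribute.
cur.execute("""
    SELECT d.city, SUM(f.parcels)
    FROM fact_shipment f
    JOIN dim_warehouse d ON d.warehouse_key = f.warehouse_key
    GROUP BY d.city
    ORDER BY d.city
""")
print(cur.fetchall())  # [('Amsterdam', 100), ('Berlin', 25)]
```

The value of the pattern is that every report runs the same simple join-and-aggregate shape against conformed dimensions, instead of each report re-implementing its own logic against the transactional database.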

The implementation of the data platform brought significant improvements to the client's business. The centralized cloud data warehouse provided timely delivery of relevant, accurate, and consistent information. PowerBI empowered users at all levels to make informed decisions confidently. Data pipelines were designed to refresh the data every 4 hours, with the entire data load and transformation process taking slightly over 1 hour for the 1 TB data warehouse. Users had access to the most up-to-date information from the current day.

The implemented data platform sets the foundation for future enhancements and scalability as the company continues to evolve. It accommodates increasing data demands and enables streamlined reporting processes.

dbt
Data warehouse
Data Pipeline
AWS
Kimball
Business Intelligence
Performance Optimization
Microsoft Power BI
PostgreSQL

Data platform optimisation

Evaluation

No rating available

06/2021 - 02/2022

100,000 euros

Hamburg


Project description

A sports tech startup operated a data platform with Amazon Redshift at its centre, supported by AWS Glue and PySpark. The ETL jobs were orchestrated by Airflow.

The DWH served both the internal business intelligence analytics and provided the analytical data used by one of the company's digital products. As data volumes grew, performance issues became evident. The ambitious but small data team lacked both the manpower and, in some areas, the deep technical expertise needed to tackle the problems in the long run.

The client made the wise decision to bring in an external consultant from Narwhal Data Solutions to work on improvements to the existing system.

We identified several flaws in the existing data infrastructure that led to both poor performance and unnecessary operating costs. After a careful analysis of the data and the typical queries, we implemented a new data partitioning scheme, which resulted in a more balanced data distribution, reduced skewness, and better cluster performance. Improved ETL processes further reduced the cluster load. Average query execution time improved by a factor of 2-3x, while cases of queries running unacceptably long were almost completely eliminated.
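
Distribution-key skew of the kind described can be measured before changing the scheme: hash each candidate key to a slice, count rows per slice, and compare the most-loaded slice against the ideal uniform share. A simplified sketch — the CRC32 hashing and slice count here are illustrative, not Redshift's internal distribution algorithm:

```python
from collections import Counter
from zlib import crc32

def skew_ratio(keys, n_slices=8):
    """Ratio of the most-loaded slice to the ideal uniform load.
    1.0 means perfectly balanced; large values mean hot slices."""
    slices = Counter(crc32(str(k).encode()) % n_slices for k in keys)
    ideal = len(keys) / n_slices
    return max(slices.values()) / ideal

# A heavily repeated key value concentrates rows on one slice...
hot = ["customer_42"] * 900 + [f"customer_{i}" for i in range(100)]
# ...while a high-cardinality, evenly distributed key spreads them out.
balanced = [f"order_{i}" for i in range(1000)]

print(round(skew_ratio(hot), 2), round(skew_ratio(balanced), 2))
```

Picking a high-cardinality, evenly distributed column as the distribution key is what turns a hot-slice cluster into one where all nodes share the work — the main lever behind the query-time improvements described above.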

Next, we proposed changes to the data architecture that could further improve system performance even as data volumes grow. The proposed changes could also deliver significant cost reductions by shrinking the Redshift cluster and moving a large part of the workload to PostgreSQL, a database more suitable for the observed usage pattern. An MVP was created to showcase the operating principles and demonstrate the performance and scalability potential of the solution; it was then handed over to the internal team for further development.

Data Architectures
SQL
ETL
Data warehouse
Business Analytics
PostgreSQL
Apache Airflow
Spark SQL
Amazon Redshift
PySpark
Business Intelligence
Performance Optimization
Minimum Viable Product

Scalable data platform

Energy

Evaluation

No rating available

12/2020 - 02/2021

25,000 euros

Zurich


Project description

A startup in the renewable energy sector collected and analysed data originating from SCADA systems. The data infrastructure and analytics platform ran on Azure, with a data warehouse built according to the Data Vault methodology on SQL Server as its centrepiece. Because part of the analysed data had to be presented in near real time, data were loaded and processed continuously in streams / mini-batches.
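
Near-real-time loading in mini-batches typically means buffering incoming readings and flushing them to the warehouse when a size or age threshold is hit. A language-neutral sketch of that pattern — the thresholds and the record shape are invented for illustration:

```python
import time

class MiniBatchLoader:
    """Buffer streaming readings; flush when the batch is full or stale."""
    def __init__(self, sink, max_rows=500, max_age_s=5.0):
        self.sink = sink              # callable that persists one batch
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.buffer = []
        self.opened_at = time.monotonic()

    def add(self, reading):
        if not self.buffer:
            self.opened_at = time.monotonic()
        self.buffer.append(reading)
        if (len(self.buffer) >= self.max_rows
                or time.monotonic() - self.opened_at >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []

batches = []
loader = MiniBatchLoader(batches.append, max_rows=3)
for reading in [{"turbine": 1, "kw": 510}, {"turbine": 2, "kw": 495},
                {"turbine": 1, "kw": 512}, {"turbine": 2, "kw": 498}]:
    loader.add(reading)
loader.flush()
# batches now holds one full batch of 3 plus a final partial batch of 1.
```

The size/age trade-off is the core tuning knob of such a pipeline: larger batches load more efficiently, while the age threshold bounds how stale the near-real-time view can get.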

Despite a relatively low volume of data (less than 1 TB), performance issues started to become visible. The company expected rapid growth and wanted to prepare its data platform to scale up by a factor of 100 in the near future. It asked Narwhal Data Solutions to review the current system, the existing requirements, and the performance issues, and ultimately propose a new architecture for a scalable system.

Several options for the architecture were proposed, ranging from an improved version of the existing system migrated to SQL Hyperscale, to a stream processing system based on Apache Kafka with ksqlDB or Flink. Several other technology options were carefully evaluated as well. Both functional and non-functional requirements were taken into account, the pros and cons of the proposed architectures were highlighted, the TCO was estimated, and an implementation / migration roadmap was drafted.

Azure
Energy
SQL
Data warehouse
Data Vault
Flink
Apache Kafka
Migration
SQL Server
Roadmapping

Macroeconomic Data Integration using Camel K

Evaluation

No rating available

01/2023 - 09/2023


Project description

Objective

We aimed to streamline the integration of macroeconomic data for a client in the finance and analytics sector.

Challenges

Our client needed efficient consolidation of macroeconomic statistics from various sources and automation of data collection and storage.

Solution

We leveraged Apache Camel K, a versatile integration platform, along with Kubernetes for orchestration.

  • Benefits of Camel K: Allows integration routes to be written in multiple languages, making it flexible for developers. It streamlined the complex movement of macroeconomic data and adapted to changing data formats and sources.
  • Quarkus Optimization: We used Quarkus, a Java framework, to enhance system performance with fast startup times and efficient resource utilization.
  • CI/CD Deployment: We implemented a CI/CD pipeline for automated deployment of Camel K integration routes, ensuring data integrity and version control.
  • Deployment Methods: We used two deployment approaches: Argo CD for structured, declarative deployments, and the Kamel CLI for a more interactive, hands-on workflow.
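
Camel K routes themselves are written in languages such as Java, YAML, or Groovy rather than Python; purely to illustrate the kind of transformation logic each route performed, here is a language-neutral sketch of mapping heterogeneous source formats onto one internal schema. The source names and field names are invented, not the client's actual feeds:

```python
import json

def normalize(record, source):
    """Map heterogeneous macroeconomic feeds onto one internal schema --
    the kind of per-source transformation each integration route performs.
    Field names are hypothetical, for illustration only."""
    if source == "eurostat":
        return {"indicator": record["code"], "period": record["time"],
                "value": float(record["obs_value"])}
    if source == "worldbank":
        return {"indicator": record["indicator_id"], "period": record["date"],
                "value": float(record["value"])}
    raise ValueError(f"unknown source: {source}")

raw = [
    ({"code": "GDP", "time": "2022", "obs_value": "3.1"}, "eurostat"),
    ({"indicator_id": "GDP", "date": "2022", "value": "3.0"}, "worldbank"),
]
unified = [normalize(rec, src) for rec, src in raw]
print(json.dumps(unified, indent=2))
```

Isolating the per-source mapping in one small function (or one route per source) is what makes the integration adaptable when a provider changes its format: only that mapping needs updating.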

Result

The project revolutionized data integration, making it efficient, reliable, and adaptable to changes in data sources and formats.

Main focus

Data Lake
Data Architectures
Data Engineering
Data warehouse
Data Governance
Big Data
Data Science
Data Analysis

Other skills

Key Performance Indicator
SQL
Statistics
ETL
Data acquisition
Stakeholder Management
Data Management
Apache Kafka
GitHub
Spark
GCP
Software Design
Apache Airflow
Dashboard
Snowflake
Tableau
System Design
Hadoop
SAP Analytics Cloud
Data Pipeline
Geo Data
Business Analytics
Data Integration
Data Visualization
Data Migration
KPI
Business Intelligence
Database Systems
Microsoft Power BI
Python
Amazon Relational Database Services
Kimball
Solution Architecture
Open Source
PostgreSQL
Database Design
Big Data Integration
Data validation
Git
Azure Data Factory
Data Modeling
Relational Databases
Azure
AWS
Data Vault
AWS Glue
GitLab
Scala
SQL Server
Oracle
SAP BW
Software Development
Metabase
Database Programming
Java
User Stories
PySpark
Amazon Redshift
Software Engineering
Azure Synapse
SAP BO
OLAP

Industries

Research
0 - 10 projects
Internet and IT
0 - 10 projects
Transport and Logistics
0 - 10 projects


