How Fujitsu implemented a global data mesh architecture and democratized data


This is a guest post co-authored with Kanehito Miyake, Engineer at Fujitsu Japan.

Fujitsu Limited was established in Japan in 1935. Currently, we have approximately 120,000 employees worldwide (as of March 2023), including group companies. We develop business in various regions around the world, starting with Japan, and provide digital services globally. To provide a variety of products, services, and solutions that are better suited to customers and society in each region, we have built business processes and systems that are optimized for each region and its market.

However, in recent years, the IT market environment has changed drastically, and it has become difficult for the entire group to respond flexibly to each individual market situation. Moreover, we are challenged not only to revisit individual products, services, and solutions, but also to reinvent entire business processes and operations.

To transform Fujitsu from an IT company to a digital transformation (DX) company, and to become a world-leading DX partner, Fujitsu has declared a shift to data-driven management. We built the OneFujitsu program, which standardizes business projects and systems throughout the company, including the domestic and overseas group companies, and tackles the major transformation of the entire company under the program.

To achieve data-driven management, we built OneData, a data utilization platform used in the four global AWS Regions, which started operation in April 2022. As of November 2023, more than 200 projects and 37,000 users were onboarded. The platform consists of approximately 370 dashboards, 360 tables registered in the data catalog, and 40 linked systems. The data size stored in Amazon Simple Storage Service (Amazon S3) exceeds 100 TB, including data processed for use in each project.

In this post, we introduce our OneData initiative. We explain how Fujitsu worked to solve the aforementioned issues and introduce an overview of the OneData design concept and its implementation. We hope this post will provide some guidance for architects and engineers.

Challenges

Like many other companies struggling with data utilization, Fujitsu faced some challenges, which we discuss in this section.

Siloed data

In Fujitsu’s long history, we restructured organizations by merging affiliated companies into Fujitsu. Although organizational integration has progressed, there are still many systems and mechanisms customized for individual contexts. There are also many systems and mechanisms that overlap across different organizations. For this reason, it takes a lot of time and effort to discover, search, and integrate data when analyzing the entire company using a common standard. This situation makes it difficult for management to grasp business trends and make decisions in a timely manner.

Under these circumstances, the OneFujitsu program is designed to have one system per business globally. Core systems such as ERP and CRM are being integrated and unified so as not to have silos. This will make it easier for users to utilize data across different organizations for specific business areas.

However, to spread a culture of data-driven decision-making not only in management but also in every organization, it is necessary to have a mechanism that enables users to easily discover various types of data in organizations, and then analyze the data quickly and flexibly when needed.

Excel-based data utilization

Microsoft Excel is available on almost everyone’s PC in the company, and it helps lower the hurdles when starting to utilize data. However, Excel is mainly designed for spreadsheets; it’s not designed for large-scale data analytics and automation. Excel files tend to contain a mixture of data and procedures (functions, macros), and many users casually copy files for one-time use cases. This makes it complex to keep both data and procedures up to date. Furthermore, managing the Excel files tends to require domain-specific knowledge of each individual context.

For these reasons, it was extremely difficult for Fujitsu to manage and utilize data at scale with Excel.

Solution overview

OneData defines three personas:

  • Publisher – This role includes the organizational and management team of systems that serve as data sources. Responsibilities include:
    • Load raw data from the data source system at the appropriate frequency.
    • Provide technical metadata for loaded data and keep it up to date.
    • Perform the cleansing process and format conversion of raw data as needed.
    • Grant access permissions to data based on the requests from data users.
  • Consumer – Consumers are organizations and projects that use the data. Responsibilities include:
    • Look for the data to be used from the technical data catalog and request access to the data.
    • Handle the processing and conversion of data into a format suitable for their own use (such as fact-dimension) with granted referencing permissions.
    • Configure business intelligence (BI) dashboards to provide data-driven insights to end-users targeted by the consumer’s project.
    • Use the latest data published by the publisher to update data as needed.
    • Promote and expand the use of databases.
  • Foundation – This role encompasses the data steward and governance team. Responsibilities include:
    • Provide a preprocessed, generic dataset of data commonly used by many consumers.
    • Manage and guide metrics for the quality of data published by each publisher.

Each role has sub-roles. For example, the consumer role has the following sub-roles with different responsibilities:

  • Data engineer – Create data processes for analysis
  • Dashboard developer – Create a BI dashboard
  • Dashboard viewer – Monitor the BI dashboard

The following diagram describes how the OneData platform works with these roles.

Let’s look at the key components of this architecture in more detail.

Publisher and consumer

In the OneData platform, the publisher is defined per data source system, and the consumer is defined per data utilization project. OneData provides an AWS account for each.

This enables the publisher to cleanse data and the consumer to process and analyze data at scale. In addition, by properly separating data and processing, it becomes simple for teams and organizations to share, manage, and inherit processes that were traditionally confined to individual PCs.

Foundation

When teams don’t have a strong enough skillset, it can take more time to model and process data, which causes longer latency and lower data quality. It can also contribute to lower utilization by end-users. To address this, the foundation role provides an already processed dataset as a generic data model for common use cases shared by many consumers. This makes high-quality data available to each consumer. Here, the foundation role takes the lead in compiling the knowledge of domain experts and making data suitable for analysis. It is also an effective approach that eliminates duplicated work across consumers. In addition, the foundation role monitors the state of the metadata, data quality indicators, data permissions, information classification labels, and so on. This monitoring is essential for data governance and data management.
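To make the idea of a data quality indicator concrete, here is a minimal sketch of one possible metric: per-column completeness, the share of rows in which a column is populated. The post does not say which indicators OneData tracks, so the metric itself, the function name `column_completeness`, and the sample rows are all assumptions made for illustration.

```python
def column_completeness(rows, columns):
    """Return the share of non-missing values per column (1.0 means fully populated)."""
    if not rows:
        return {column: 0.0 for column in columns}
    return {
        column: sum(1 for row in rows if row.get(column) not in (None, "")) / len(rows)
        for column in columns
    }


# Hypothetical rows standing in for a table published by one publisher.
published_rows = [
    {"order_id": 1, "region": "JP", "amount": 120.0},
    {"order_id": 2, "region": "", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": None},
    {"order_id": 4, "region": "EU", "amount": 50.0},
]

completeness = column_completeness(published_rows, ["order_id", "region", "amount"])
```

A governance team could compare such values against a threshold per publisher and follow up when a column falls below it.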

BI and visualization

Individual consumers have a dedicated space in a BI tool. In the past, if users wanted to go beyond simple data visualization using Excel, they had to build and maintain their own BI tools, which caused silos. By unifying these BI tools, OneData makes BI tools easier for consumers to adopt and centralizes operation and maintenance, achieving optimization on a company-wide scale.

Additionally, to keep portability between BI tools, OneData recommends that users transform data within the consumer AWS account instead of transforming data in the BI tool. With this approach, the BI tool loads data from AWS Glue Data Catalog tables through an Amazon Athena JDBC/ODBC driver without any further transformations.
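The following sketch illustrates why keeping transformations in the data layer helps portability. It uses Python's built-in `sqlite3` module purely as a local stand-in for the query layer; it is not how OneData itself is implemented, and the table, view, and function names are hypothetical. The aggregation is defined once next to the data, so any BI tool only needs a plain `SELECT`.

```python
import sqlite3

# In-memory SQLite stands in for the consumer account's query layer (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount REAL);
    INSERT INTO raw_orders VALUES (1, 'JP', 120.0), (2, 'JP', 80.0), (3, 'EU', 50.0);

    -- The transformation lives next to the data, not inside the BI tool.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total_amount, COUNT(*) AS order_count
        FROM raw_orders
        GROUP BY region;
""")


def load_for_dashboard(connection, table_name):
    """What a BI tool would do: read the prepared table as-is, with no extra logic."""
    cursor = connection.execute(f"SELECT * FROM {table_name} ORDER BY region")
    return cursor.fetchall()


rows = load_for_dashboard(conn, "sales_by_region")
```

Because the dashboard query carries no business logic, switching BI tools would only require pointing the new tool at the same prepared table.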

Deployment and operational excellence

To provide OneData as a common service for Fujitsu and group companies around the world, Regional OneData has been deployed in several locations. Regional OneData represents a unit of system configurations. It is designed to provide lower network latency for platform users and to be optimized for local languages, working hours for system operations and support, and region-specific legal restrictions, such as data residency and personal information protection.

The Regional Operations Unit (ROU), a virtual organization that brings together members from each region, is responsible for operating Regional OneData in each of these regions. OneData HQ is responsible for supervising these ROUs, as well as planning and managing OneData as a whole.

In addition, we have a specially positioned OneData called Global OneData, where global data utilization spans each region. Only properly cleansed and sanitized data is transferred between each Regional OneData and Global OneData.

Systems such as ERP and CRM accumulate data as publishers for Global OneData, and the dashboards that executives in various regions use to monitor business conditions with global metrics act as consumers for Global OneData.

Technical concepts

In this section, we discuss some of the technical concepts of the solution.

Large-scale multi-account

We have adopted a multi-account strategy to provide AWS accounts for each project. Many publishers and consumers are already onboarded into OneData, and the number is expected to increase in the future. With this strategy, future usage expansion at scale can be achieved without affecting existing users.

Additionally, this strategy allowed us to have clear boundaries in security, costs, and service quotas for each AWS service.

All the AWS accounts are deployed and managed through AWS Organizations and AWS Control Tower.

Serverless

Although we provide independent AWS accounts for each publisher and consumer, both operational costs and resource costs would be enormous if we accommodated individual user requests, such as, “I want a virtual machine or RDBMS to run specific tools for data processing.” To avoid such continuous operational and resource costs, we have adopted AWS serverless services for all the computing resources necessary for our activities as a publisher and consumer.

We use AWS Glue to preprocess, cleanse, and enrich data. Optionally, AWS Lambda or Amazon Elastic Container Service (Amazon ECS) with AWS Fargate can also be used based on preferences. We allow users to set up AWS Step Functions for orchestration and Amazon CloudWatch for monitoring. In addition, we provide Amazon Aurora Serverless PostgreSQL as standard for consumers, to meet their needs for data processing with extract, load, and transform (ELT) jobs. With this approach, only the consumers who require those services incur charges, based on usage. We are able to take advantage of lower operational and resource costs thanks to the unique benefit of serverless (or more accurately, pay-as-you-go) services.

AWS provides many serverless services, and OneData has integrated them to provide scalability that lets active users quickly obtain the capacity they need, while minimizing the cost for infrequent users.
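As an illustration of the orchestration described above, the following sketch builds an Amazon States Language definition in which AWS Step Functions runs AWS Glue jobs one after another. It only constructs the JSON document and does not call AWS. The job names, the state naming scheme, and the helper `build_glue_pipeline_definition` are hypothetical, and the post does not describe OneData's actual pipelines.

```python
import json


def build_glue_pipeline_definition(job_names):
    """Return an Amazon States Language definition that runs Glue jobs in order."""
    if not job_names:
        raise ValueError("at least one Glue job name is required")
    states = {}
    for index, job_name in enumerate(job_names):
        state = {
            "Type": "Task",
            # The .sync suffix makes Step Functions wait for the job run to finish.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": job_name},
        }
        if index + 1 < len(job_names):
            state["Next"] = f"Run_{job_names[index + 1]}"
        else:
            state["End"] = True
        states[f"Run_{job_name}"] = state
    return {
        "Comment": "Hypothetical consumer pipeline",
        "StartAt": f"Run_{job_names[0]}",
        "States": states,
    }


# Hypothetical job names: cleanse raw data, then build a reporting table.
definition = build_glue_pipeline_definition(["cleanse_orders", "build_sales_mart"])
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON could be supplied as a state machine definition; because every step is a managed service call, no compute is provisioned or billed while the pipeline is idle.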

Data ownership and access control

In OneData, we have adopted a data mesh architecture where each publisher maintains ownership of data in a distributed and decentralized manner. When a consumer discovers the data they want to use, they request access from the publisher. The publisher accepts the request and grants permissions only when the request meets their own criteria. With the AWS Glue Data Catalog and AWS Lake Formation, there is no need to update S3 bucket policies or AWS Identity and Access Management (IAM) policies every time we allow access to individual data on an S3 data lake, and we can effortlessly grant the necessary permissions for the databases, tables, columns, and rows when needed.
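The request-and-grant flow can be sketched as follows. This is a simplified illustration under stated assumptions, not OneData's implementation: the approval criteria, the `AccessRequest` class, the account ID, the role name, and the database, table, and column names are all hypothetical. The dictionary is shaped as keyword arguments for Lake Formation's `GrantPermissions` API (exposed in boto3 as `grant_permissions`), and the actual call is left as a comment so the example runs without AWS credentials. Row-level filtering is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    consumer_principal_arn: str  # IAM role of the consumer project
    database: str
    table: str
    columns: tuple
    purpose: str


def publisher_approves(request, restricted_columns):
    """Hypothetical publisher criteria: a stated purpose and no restricted columns."""
    has_purpose = bool(request.purpose.strip())
    touches_restricted = bool(set(request.columns) & set(restricted_columns))
    return has_purpose and not touches_restricted


def build_grant_parameters(request):
    """Build keyword arguments for a column-level SELECT grant in Lake Formation."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": request.consumer_principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": request.database,
                "Name": request.table,
                "ColumnNames": list(request.columns),
            }
        },
        "Permissions": ["SELECT"],
    }


request = AccessRequest(
    consumer_principal_arn="arn:aws:iam::111122223333:role/consumer-analytics",
    database="sales",
    table="orders",
    columns=("order_id", "region", "amount"),
    purpose="Regional sales dashboard",
)

grant_parameters = None
if publisher_approves(request, restricted_columns={"customer_email"}):
    grant_parameters = build_grant_parameters(request)
    # With AWS credentials in the publisher account, the call would be:
    # boto3.client("lakeformation").grant_permissions(**grant_parameters)
```

The grant names only catalog objects (database, table, columns) and a principal, which is why no S3 bucket policy or IAM policy needs to change per request.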

Conclusion

Since the launch of OneData in April 2022, we have been persistently carrying out educational activities to expand the number of users and introducing success stories on our portal site. As a result, we have been promoting change management within the company and are actively utilizing data in each department. Regional OneData is being rolled out gradually, and we plan to further expand the scale of use in the future.

With its global expansion, the development of basic functions as a data utilization platform will reach a milestone. As we move forward, it will be important to make sure that the OneData platform is used effectively throughout Fujitsu, while incorporating new technologies related to data analysis as appropriate. For example, we are preparing to provide more advanced machine learning functions to OneData users using Amazon SageMaker Studio, and we are investigating the applicability of AWS Glue Data Quality to reduce manual quality monitoring efforts. Furthermore, we are currently in the process of implementing Amazon DataZone through various initiatives and efforts, such as verifying its functionality and examining how it can operate while bridging the gap between OneData’s existing processes and the ideal process we are aiming for.

We have had the opportunity to discuss data utilization with various partners and customers, and although individual challenges may differ in size and context, the issues that we are currently trying to solve with OneData are common to many of them.

This post describes only a small portion of how Fujitsu tackled challenges using the AWS Cloud, but we hope the post will give you some inspiration to solve your own challenges.


About the Authors


Kanehito Miyake is an engineer at Fujitsu Japan and in charge of OneData’s solution and cloud architecture. He spearheaded the architectural study of the OneData project and contributed greatly to promoting data utilization at Fujitsu with his expertise. He loves rockfish fishing.

Junpei Ozono is a Go-to-market Data & AI solutions architect at AWS in Japan. Junpei supports customers’ journeys on the AWS Cloud from Data & AI aspects and guides them to design and develop data-driven architectures powered by AWS services.
