Amar Sharma

Sr. Developer | Standard Chartered

Toronto | Cell: +1 647-846-2484 | [email protected]

Skills: Airflow, AWS, Azure, GitHub, Hadoop, HDFS, Hive, InfluxDB, Kubernetes, Oozie, PySpark, Python, Snowflake, Spark, Spark-SQL, Sqoop
Overview
5 years of total experience, including 2.5+ years as a Big Data Developer executing data-driven solutions to increase the efficiency, accuracy, and utility of external and internal data processing.

Experienced in data migration (IBM Mainframe), MS-SQL Server, data cleaning, and ETL processes. Worked with tools such as Hive, Sqoop, Oozie, Airflow, InfluxDB, Snowflake, Python, Spark-SQL, PySpark, Microsoft C++, Git, GitHub, Nagios, and Microsoft Azure, and have a working understanding of several machine learning algorithm libraries. While working with numerous types of data, have handled more than 500 petabytes across clusters of more than 100 nodes.

Provide overall architecture responsibility, including roadmaps, leadership, and planning. Good experience working with the cloud platforms Microsoft Azure, Amazon Web Services (AWS), Optum Cloud, and IBM Cloud.

Provide technical leadership and governance of the big data team and implementation of the solution architecture across the Hadoop ecosystem (Hadoop (Hortonworks/Cloudera), Hive, TEZ, PySpark, Oozie, Hue, etc.). Configure and tune production and development Hadoop environments and their various intermixed components. Develop technical presentations and proposals and deliver customer presentations. Wrote scripts in Jupyter Notebook and pushed them to client environments.
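
As a hedged illustration of the environment tuning described above, a Hive-enabled PySpark session might pin a few common properties like this; the values shown are placeholders, not settings from any actual client cluster:

```python
# Minimal sketch: a Hive-enabled PySpark session with a few common
# tuning properties. Values are illustrative, not production settings.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("env-tuning-sketch")
    .enableHiveSupport()
    .config("spark.sql.shuffle.partitions", "400")            # size to cluster cores
    .config("hive.exec.dynamic.partition", "true")            # allow dynamic partitions
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .getOrCreate()
)
```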

Mainframe developer with 2.5 years of experience writing COBOL programs, preparing and executing JCL jobs, and writing and maintaining CICS front-end screens. Handled high-priority L2 and L3 incidents as part of a DevOps team.

Certifications

Python – Data Science

Container & Kubernetes

Big Data Hadoop

Alteryx Core Designer – Alteryx

AHM 250

Experience
Standard Chartered – Aug’19 to Present

Associate Technical Manager (Big Data Hadoop & Analytics)
Manage a team of 10, distributing daily/weekly service requests, and manage multiple deployments into PROD through go-live.

Develop concrete, detailed project plans covering schedule and budget; outline each team member's duties, identify project goals, and set the project timeline.

Monitor the team to ensure projects remain on track, meet deadlines, and stay within budget.

 

Translate business challenges into analytical problems and provide data-driven analytical solutions.

 

Manage portfolio-level requests across verticals and handle stakeholder calls from multiple territories, including Singapore, Hong Kong, Malaysia, Poland, and China.

Piloted a proof of concept (POC) to implement a big data solution for credit/savings accounts.

Coordinate conflicting project requirements with other stakeholders while defining the project scope.

 

Designed a big data solution using MS Azure products to process, store, and analyze 80 TB of data.

Administered a Hadoop cluster with 30 worker nodes and installed the relevant MS Azure tools separately on multiple Azure VMs for test preparation.

 

Assessed business implications for each project phase and monitored progress to meet deadlines, standards and cost targets.

 

Set up weekly status meetings with the team and report monthly status to the program manager.

 

Create, validate, and maintain scripts to load data with Sqoop, both manually and on an automated schedule.
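
For illustration, a minimal sketch of the kind of Sqoop wrapper script this involves; the connection string, table, and HDFS path are hypothetical placeholders, not actual job parameters:

```python
# Hedged sketch: a thin Python wrapper around a Sqoop import.
# All names and paths below are placeholders.
import subprocess

def sqoop_import(jdbc_url: str, table: str, target_dir: str, mappers: int = 4) -> None:
    """Run a Sqoop import and fail loudly on a non-zero exit code."""
    subprocess.run(
        [
            "sqoop", "import",
            "--connect", jdbc_url,          # e.g. jdbc:sqlserver://host:1433;databaseName=db
            "--table", table,
            "--target-dir", target_dir,     # HDFS landing directory
            "--num-mappers", str(mappers),  # parallelism of the import
        ],
        check=True,
    )

sqoop_import("jdbc:sqlserver://dbhost:1433;databaseName=claims",
             "accounts", "/data/raw/accounts")
```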

 

Migrate Oozie jobs to Airflow, and create new Airflow DAGs to automate Sqoop jobs on weekly and monthly schedules.
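
A minimal sketch of what such a weekly DAG can look like, written against the Airflow 2 API; the DAG id, dates, and Sqoop command are illustrative assumptions:

```python
# Hedged sketch: a weekly Airflow DAG that shells out to a Sqoop import.
# The DAG id, dates, and command are placeholders, not production values.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="weekly_sqoop_load",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@weekly",   # an Oozie coordinator frequency maps to a cron/preset
    catchup=False,
) as dag:
    load_accounts = BashOperator(
        task_id="sqoop_import_accounts",
        bash_command=(
            "sqoop import --connect jdbc:db2://mainframe:50000/PRODDB "
            "--table ACCOUNTS --target-dir /data/raw/accounts --num-mappers 4"
        ),
    )
```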

 

Move data to and from mainframe DB2 and MS-SQL Server using Sqoop and Spark-SQL jobs.
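
On the Spark side, a JDBC read like the following is one plausible shape; the URL, table, and credentials are placeholders, and the SQL Server JDBC driver is assumed to be on the Spark classpath:

```python
# Hedged sketch: read a SQL Server table into Spark over JDBC and land
# it as Parquet. All connection details are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-fetch-sketch").getOrCreate()

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=claims")
    .option("dbtable", "dbo.accounts")   # source table (placeholder)
    .option("user", "etl_user")          # placeholder credentials
    .option("password", "***")
    .load()
)

df.write.mode("overwrite").parquet("/data/raw/accounts")  # land in the lake
```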

 

Designed Hive tables to load data to and from external tables.
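
As a hedged illustration of that design, an external Hive table over a lake path and a load into a managed table might look like this; the database, column, and location names are hypothetical:

```python
# Hedged sketch: declare an external Hive table over an HDFS location,
# then load from it into a managed table. All names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS staging.accounts_ext (
        account_id BIGINT,
        balance    DECIMAL(18, 2)
    )
    STORED AS PARQUET
    LOCATION '/data/raw/accounts'
""")

spark.sql("""
    INSERT OVERWRITE TABLE warehouse.accounts
    SELECT account_id, balance FROM staging.accounts_ext
""")
```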

 

Write DistCp shell scripts to copy data across servers.
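
Such a DistCp copy typically reduces to an invocation like the following, shown here wrapped in Python for consistency; the NameNode hosts and paths are placeholders:

```python
# Hedged sketch: inter-cluster HDFS copy via DistCp.
# Hostnames and paths are hypothetical.
import subprocess

subprocess.run(
    [
        "hadoop", "distcp",
        "-update",                                  # copy only missing/changed files
        "hdfs://source-nn:8020/data/raw/accounts",  # source cluster path
        "hdfs://target-nn:8020/data/raw/accounts",  # destination cluster path
    ],
    check=True,
)
```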

 

Run executive reports using Hive and Tableau views.

 

Involved in System Integration Testing and fixed bugs encountered during SIT.

 

Involved in UAT and fixed bugs encountered during UAT and Production Support activities.

 

Tool Set: InfluxDB, Airflow, Hive, Sqoop, Oozie, Jupyter Notebook, Control-M, Microsoft C++, Python, PySpark, Spark-SQL, Tableau, Git, GitHub, Nagios, ServiceNow, BMC (Service Request Management), HP-ALM, IBM CDC, Microsoft Azure (PowerShell, HDInsight, VPN, Load Balancer, Jump box, Network Security Group (NSG), Blob Storage, Data Factory)

PwC – Jan’19 to Jul’19
Sr. Hadoop Consultant/Developer
Client – Children’s Mercy Hospital (CMH)

Evaluate new technologies, execute proofs of concept (POC), and develop specialized algorithms.

 

Set up a new environment in Azure, adding multiple virtual machines and networking them together.

Set up a virtual private network in Azure to secure the servers and validate end-user access.

Connect multiple data sources to the Azure network, such as a SAS server, Active Directory, Cerner, and IBM Mainframe (OS/390).

Fetch data from multiple sources and land it in Azure storage.
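
One hedged example of that landing step, assuming the azure-storage-blob Python SDK and placeholder names; treat this purely as a sketch of the pattern, not the tooling actually used:

```python
# Hedged sketch: upload an extracted file to Azure Blob Storage.
# Connection string, container, and file names are placeholders.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("raw-landing")

with open("extracts/cerner_encounters.csv", "rb") as data:
    container.upload_blob(name="cerner/encounters.csv", data=data, overwrite=True)
```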

Tool Set: Azure PowerShell, Airflow, Spark SQL, HP-ALM, SQL Server 2016, Azure HDInsight, Jupyter, VPN, Azure Load Balancer, Jump box, Network Security Group (NSG).

 

Client – UnitedHealth Group (UHG)

Migration (Mainframe to Hadoop)

Present the design architecture to various stakeholders and build consensus around it.

Hive query tuning, data encryption and decryption, and data analysis.

Multi-layer and staging-level optimization and configuration, from data ingestion through final output, per the business model.

 

Work with multiple Hive storage formats, ORC and Parquet, for optimization; handle conversion (CSV/text to Parquet, GZIP or ZLIB to Parquet or ORC), cross data validation, and compression.
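
A hedged sketch of one such conversion, reading gzipped CSV and writing Parquet and ORC; the paths are illustrative, and schema inference stands in for the pinned schemas a production job would use:

```python
# Hedged sketch: convert gzipped CSV extracts to Parquet and ORC.
# Paths are placeholders; Spark decompresses .gz files on read.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-conversion-sketch").getOrCreate()

df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")   # illustrative; production jobs pin schemas
    .csv("/data/raw/claims/*.csv.gz")
)

df.write.mode("overwrite").parquet("/data/optimized/claims_parquet")
df.write.mode("overwrite").orc("/data/optimized/claims_orc")

# Cross-validate the converted copy, e.g. by row count.
assert spark.read.parquet("/data/optimized/claims_parquet").count() == df.count()
```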

 

Compressed and decompressed millions of GZIP files and converted them to Parquet.

Investigate the root cause of reported data issues.

 

Design, implement, and support an analytical data infrastructure providing ad hoc access to large datasets and computing power.

 

Collaborate with Business Intelligence Engineers (BIEs) to recognize and help adopt best practices in reporting and analysis: test design, analysis, validation, and documentation.

Help continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for customers.

Provide technical leadership and mentor other engineers on data engineering best practices.

 

Tool Set: Shell, Hive, PySpark, Sqoop, Oozie, Python, Spark SQL, ServiceNow, GitHub, Flat Files, SQL Server, Alteryx, Azure HDInsight with Hortonworks, Cloudera 2.3, Jupyter.

UnitedHealth Group – Sep’15 to May’18
Project 3 – Big Data Engineer (UMR) – Jul’17 – May’18
Mainframe migration; part of Hadoop module installation and configuration.

Prepare business, functional, and logical models.

Created new Hive tables; loaded and unloaded data from the data lake and flat files.

Develop, refine, and scale data management and analytics procedures, systems, workflows, and best practices.

 

Work with product owners to establish the design of experiments and the measurement system for the effectiveness of product improvements.

Work with Project Management to provide timely estimates, updates, and status.

 

Work closely with data scientists to assist on feature engineering, model training frameworks, and model deployments at scale

 

Work with the Product Management and Software teams to develop the features for the growing Optum business

 

Perform development and operations duties, sometimes requiring support during off-work hours

 

Work with application developers and DBAs to diagnose and resolve query performance problems

 

Tool Set: Shell, Hive, Sqoop, Oozie, Python, PySpark, Spark SQL, Tableau, IBM CDC, Cloudera 2.3, ServiceNow, HP-ALM.

 

Project 2 – Mainframe Developer – Feb’16 – Jun’17 (UMR)
Involved in the implementation of 4+ major releases of the UMR application per calendar year.

 

Actively involved in project development and maintenance for medical claims processing, monitoring the supported regions for abends; resolved them through code modification, testing, and production implementation while keeping the production environment stable.

 

Develop new and maintain existing UMR CICS screens.

 

Eliminated errors and reduced abends; handled various high-priority incidents (P1, P2, and P3).

 

Tool Set: COBOL, JCL, CICS, DB2, VSAM (Flat file), Service Now, HP-ALM, ADHOC, Endeavour, XPED

 

Project 1 – Mainframe Developer – Sep’15 – Feb’16 (TOPS-DEV)
Manage daily transmission of patient information to business partners, including confirming claim submission and verifying data for insurance acceptance/rejection.

 

Meet with end users and department heads to gather and document requirements, establish technical specifications, and set deadlines and milestones.

 

Work directly with accounting and IT teams throughout integration testing.

 

Created reporting programs which eliminated erroneous and redundant submissions.

 

Coded, tested, debugged, implemented, and managed mainframe applications.

 

Worked with end-users to assess requirements and create feasible solutions.

 

Confirmed data accuracy after introducing new system.

 

Developed and maintained batch/online systems.

 

Tool Set: COBOL, JCL, CICS, DB2, VSAM (Flat file), Service Now, HP-ALM, ADHOC, Endeavour, XPED

Infosys – Feb’15 to Aug’15

Associate Mainframe Developer
Develop new COBOL programs and new CICS screens.

 

Resolved complex and conflicting design issues.

 

Supervised component and code test activities.

 

Executed clean ups from production data analysis.

 

Monitored various system processes as assigned.

 

Ensured complete and accurate issue tracking and reporting.

 

Escalated production timelines to meet client requests.

 

Identified production issues impacting code modules.

 

Tool Set: COBOL, JCL, DB2, VSAM Flat file, HP-ALM, ADHOC

Education
Bachelor of Computer Applications – Full Time
Mahatma Gandhi University

Date graduated: Dec 2014

Master of Computer Applications – Distance Learning
Jaipur National University

Date graduated: Jun 2018