Our Big Data capability team is hiring technologists who can produce beautiful and functional code to solve complex analytics problems. If you are an exceptional developer who loves to push boundaries and solve complex business problems with innovative solutions, we would like to talk with you.
- Provide technical leadership in the Big Data space (Hadoop stack such as M/R, HDFS, Pig, Hive, HBase, Flume, Sqoop, etc.; NoSQL stores such as Cassandra, HBase, etc.) across Fractal, and contribute to open source Big Data technologies.
- Visualize and evangelize next-generation infrastructure in the Big Data space (batch, near-real-time, and real-time technologies).
- Evaluate and recommend a Big Data technology stack that aligns with the company's technology
- Be passionate about continuous learning, experimenting with, applying, and contributing to cutting-edge open source technologies and software paradigms
- Drive significant technology initiatives end to end and across multiple layers of architecture
- Provide strong technical leadership in adopting and contributing to open source technologies related to Big Data across the company.
- Provide strong technical expertise (performance, application design, stack upgrades) to lead Platform Engineering
- Define and drive best practices that can be adopted in the Big Data stack, and evangelize them across teams and BUs.
- Provide technical leadership and be a role model to data engineers pursuing a technical career path in engineering
- Provide/inspire innovations that fuel the growth of Fractal
QUALIFICATIONS & EXPERIENCE
7-11 years of demonstrable experience designing technological solutions to complex data problems, and developing and testing modular, reusable, efficient, and scalable code to implement those solutions.
Ideally, This Would Include Work On The Following Technologies
- Expert-level proficiency in at least one of Java, C++ or Python (preferred). Scala knowledge is a strong advantage.
- Strong understanding of and experience with distributed computing frameworks, particularly Apache Hadoop 2.0 (YARN, MR & HDFS) and associated technologies: one or more of Hive, Sqoop, Avro, Flume, Oozie, Zookeeper, etc. Hands-on experience with Apache Spark and its components (Streaming, SQL, MLlib) is a strong advantage.
- Operating knowledge of cloud computing platforms (AWS, especially EMR, EC2, S3, SWF services and the AWS CLI)
- Experience working in a Linux computing environment and using command-line tools, including knowledge of shell/Python scripting for automating common tasks
- Ability to work in a team in an agile setting, familiarity with JIRA, and a clear understanding of how Git works.
- A technologist who loves to code and design
In addition, the ideal candidate would have great problem-solving skills, and the ability & confidence to hack their way out of tight corners.
- Java or Python or C++ expertise
- Linux environment and shell scripting
- Distributed computing frameworks (Hadoop or Spark)
- Cloud computing platforms (AWS)
Desirable Experience (Would Be a Plus)
- Statistical or machine learning DSL like R
- Distributed and low latency (streaming) application architecture
- Row store distributed DBMSs such as Cassandra
- Familiarity with API design
- B.E./B.Tech./M.Tech. in Computer Science or a related technical degree, or equivalent