Lead Site Reliability Engineer Information Technology (IT) - Boylston, MA at Geebo

Lead Site Reliability Engineer


Job Description:
Join a team of more than 30,000 team members, comprising our Club Support Center, over 230 clubs, and 7 distribution centers. We're committed to delivering value and convenience to our Members, helping them save every day on everything they need for their families and homes. BJ's Wholesale Club offers a collaborative team environment where all team members can learn, grow and be themselves.

The Benefits of working at BJ's

  • BJ's pays weekly
  • Generous time off programs to support busy lifestyles
      o Vacation, Personal, Holiday, Sick, Bereavement Leave, Jury Duty
  • Benefit plans for your changing needs
      o Three medical plans, Health Reimbursement Account (HRA), Health Savings Account (HSA), two dental plans, flexible spending

Eligibility requirements vary by position. Medical plans vary by location.

Site Reliability Engineering (SRE) is an engineering discipline that combines software engineering and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. A Lead SRE within the BJ's Digital team is a hands-on role focused on expanding our tooling and automation and on improving the resilience and availability of digital platform applications and processes through SRE principles and practices.

Major Tasks, Responsibilities, and Key Accountabilities

  • Own the end-to-end availability, security, and performance of mission-critical applications and services that are part of the Digital systems
  • Analyze technical issues, identify the root cause, and provide a fix in the production environment (never solve the same problem twice)
  • Partner with multiple internal teams to groom nonfunctional requirements and work on implementations
  • Automate or streamline manual tasks and redundancies within the infrastructure organization
  • Implement SRE best practices to ensure availability, reliability, and fault tolerance wherever applicable
  • Be the SRE ambassador on an Agile software development team
  • Drive product reliability improvements through monitoring, alerting, and application of software development best practices
  • Identify creative ways to break the system, uncover and report nonfunctional defects, and validate that systems and solutions are operating as intended
  • Perform proofs of concept to evaluate new technologies and integrations
  • Able to work fast and reliably under pressure
  • A strong critical thinker who identifies problems before they happen
  • Troubleshoot performance and stability issues using a wide variety of tools
  • Evaluate and manage application and environment security
  • Share off-hours on-call duty with the team for any production issues

Qualifications

  • Bachelor's degree in Computer Science or a related field with continuous and progressive experience
  • 6 years of total IT experience (2 years of experience in development roles and 3 years of experience in a reliability engineering role)
  • Hands-on experience with performance analysis, scalability, and reliability testing techniques
  • Experience with APM tools (New Relic, Dynatrace, or similar) and log monitoring tools (Splunk, Scalyr, or similar)
  • Strong knowledge of and hands-on experience with Linux, SQL, and shell scripting
  • Familiarity with object-oriented programming languages (Java) and concepts, and hands-on experience with Java applications (Spring Boot services)
  • Hands-on experience with cloud service concepts, preferably AWS
  • Hands-on experience with SRE practices and with writing and running chaos engineering experiments
  • Knowledge of HCL Commerce and IBM Sterling platforms is advantageous
  • A strong critical thinker who identifies problems before they happen
  • Strong written and oral communication skills with a high degree of comfort speaking with engineering management, developers, and leadership
  • Demonstrated ability to adapt to new technologies and learn quickly

Nice To Have

  • Previous experience in an eCommerce-based company
  • Experience implementing Blue/Green deployments using CI/CD

Environmental Job Conditions

  • Support and maintain globally distributed, multi-cloud (public and/or private) environments
  • Automate common, repeatable tasks at large scale to streamline operational procedures
  • Follow change management processes during implementations
  • Use and maintain version control for application infrastructure
  • Work in a diverse and global team environment
  • Cross-train with other global team members
  • Participate in an on-call rotation as required
  • Promote the DevOps/SRE mindset

Estimated Salary: $20 to $28 per hour based on qualifications.
