Network Systems Reliability Operations Engineer

March 31, 2024

Job Description

Job Posting Title: Network Systems Reliability Operations Engineer
Requisition ID: 10080333

Job Summary:

The Walt Disney Company is a world-class entertainment and technological leader. Walt's passion was to continuously envision new ways to move audiences around the world, a passion that remains our touchstone in an enterprise that stretches from theme parks, resorts, and a cruise line to sports, news, movies, and a variety of other businesses. Uniting each endeavor is a dedication to creating and delivering unforgettable experiences, and we are constantly looking for new ways to improve these exciting experiences.

Enterprise Technology is responsible for strategy, architecture, engineering, operations, and support of IT for The Walt Disney Company, including digital worlds and the flagship websites Disney.com, Disney Family.com, ABC.com, and ESPN.com. Enterprise Technology provides services in both platform engineering and web operations. Platform engineering includes Content Management Systems, E-commerce, Video Distribution Technology, Registration, Advertising Systems, and Data Warehouse & Reporting.

The Disney Technology Operations Command Center (DTOC) is a 24x7x365 mission-critical services operation center responsible for service availability, with a primary focus on rapidly responding to, correlating, and reducing the impact of outages. We are accountable for identifying and facilitating the resolution of service-impacting events, and for collaborating with other technology teams to prevent future impact through proactive event management and incident and problem analysis. The DTOC drives the execution of the major incident process, including communication to executives and key stakeholders.
The DTOC owns and executes the IT Emergency Operations Center Crisis Management plan and process, with responsibility for maturing the plan and its integration into the overall Corporate Crisis Management and TWDC programs. The DTOC also provides ongoing first- and second-level technical support of requests, performs validation procedures for routine system/service checks, and fulfills proactive monitoring with communication for HyperCare of significant business events.

The Network SRO Engineer will provide operational oversight and technical leadership and is responsible for monitoring, identifying, and coordinating with other technologists across segments to fine-tune system operations and rally to resolve service interruptions. This role is responsible for the end-to-end reliability and operations of IT services and for providing consultation and training to other clients and segments within TWDC. The SRO Engineer will examine IT systems for defects and communicate maintenance schedules and critical events across the company.

Working with engineers and analysts at all levels, the SRO Engineer will interact with computer and software engineers, quality control specialists, infrastructure service leads, segment technologists, and others to ensure service availability, increase efficiency, and establish best practices for the execution and continuous improvement of the Event, Incident, Major Incident, Crisis Management, HyperCare execution, and Problem Management processes within the DTOC. Additionally, this position will drive service improvement initiatives through proactive monitoring and enhancement actions addressing gaps identified through analytics and problem management. The SRO Engineer is an active member of the DTOC service team focused on operations, while also ensuring operational sustainability by contributing to the development, testing, and evaluation of the services supported.
The SRO Engineer leverages partnerships with the business, the customer base, and suppliers to successfully deliver services that meet agreed-upon expectations. The role provides a 24x7x365 first point of contact for centralized incident response and recovery that consistently and reliably triages reported or automated incidents, applies recovery procedures, and engages domain experts to restore steady-state operations. It provides all core services on a priority basis, with dedicated support to ensure the success of critical events.

The SROE team holds accountability for the availability and stability of the network services. This includes Data Center networks, Wide Area and Metro Area networks, Wi-Fi, end-user LANs, firewalls, WAN acceleration, application load-balancing, and a wide array of supporting technologies. The ideal candidate will have expertise in networking technologies such as routing and switching (especially on the Cisco Nexus platform), load-balancing, firewalls, and wireless to assist in the higher levels of support responsibilities.

What You Will Be Doing:

Carries and maintains a relevant and up-to-date skill set in the areas of x86 hardware technology, Windows, Linux, RISC operating systems, P-Series hardware, SAN, NAS, and data protection technologies.
Must have a working knowledge of relevant WAN/LAN technologies, wireless infrastructure, DNS/DHCP, load balancers, WAN accelerators, and other network technologies.
Implement and maintain technology observability and alerting solutions to provide real-time insights into system health, performance, and compliance.
Establish and maintain service level objectives (SLOs) and service level indicators (SLIs) for critical enterprise services.
Monitor and manage the performance and availability of enterprise applications, systems, and infrastructure, ensuring they meet or exceed established SLOs.
Proactively identify, diagnose, troubleshoot, and resolve infrastructure, application, and IT operations issues in collaboration with other IT support teams.
Develop, implement, and maintain automation tools and scripts to improve the efficiency and reliability of IT operations and infrastructure.
Seasoned technologist who will identify technology and operational challenges in solutions and products offered by Architecture and Engineering teams as well as outside vendors and OEMs.
In partnership and cooperation with the architecture and engineering teams, ensures that products currently in ideation and development are being engineered with long-term operational sustainment goals in mind.
Must have a solid understanding of Internet technologies and availability strategies for digital platforms.
Must be familiar with complex network topics and availability approaches to drive performance from all network operations center functions.
Solve modern network issues spanning LAN, WAN, data center, and cloud infrastructure.
Provide technical leadership during major incidents and strive to quickly resolve complex network incidents.
Accountable for developing and socializing technical solutions.
Responsible for influencing the strategies of peer organizations, ensuring that their strategic plans are in alignment with the technical direction of Network Operations and/or Strategy and Architecture.
This is a hands-on, highly involved position. The ideal candidate will have a solid background in various networking technologies, and the role will require occasional work outside of normal business hours.
Proficiency in one or more scripting languages (e.g., Python, Bash, Ruby) and automation tools (e.g., Python, PowerShell).
Solid understanding of observability, monitoring, and alerting tools (e.g., Splunk, New Relic, Grafana, ELK Stack, Datadog).
Familiarity with modern operations support methodologies and practices, such as Site Reliability Engineering (SRE).
Strong technology problem-solving and analytical skills, with the ability to quickly diagnose and resolve complex technical issues.
Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
Identify service improvement opportunities through trend analysis, proactive techniques, and after-action reviews.
Analyze and publish operational utilization and service performance metrics regularly.

Preferred Qualifications:

Ability to use Ekahau and similar products to occasionally perform active and passive surveys, analyze the survey results, and provide recommendations.
Experience with wireless design to identify design gaps and provide recommendations from an operational perspective.
Experience managing Aruba wireless infrastructure.
Proficiency in Python and scripting tools to aid in automating routine network tasks.
Familiarity with multi-cloud environments (AWS, Azure, and GCP) and connectivity options to the cloud from the enterprise network.
Outstanding communication skills are a must. Demonstrated ability to communicate effectively at a variety of levels and to communicate intricate technical and procedural matters to both technical and non-technical personnel across diverse cultures.
Ability to be an effective team member and technical leader.
Ability to recognize and embrace change as the external environment and organization evolve.
Good understanding of how IT technology supports the Enterprise and the Business Segments.
Master's degree in computer science or a related field is a plus.
Technical certifications, including wireless certification, CCNP, CCIE, or Network DevOps certification, are a plus.

Basic Qualifications:

Current/active CCNA certification.
Minimum of 3 years in either a large IT shared services organization or a telecom service provider environment.
Expert-level knowledge and understanding of networking protocols and technologies such as Ethernet, BGP, EIGRP, OSPF, Cisco HW and SW (IOS/IOS-XR/NX-OS), Palo Alto firewalls, load balancers, security protocols, MPLS, QoS, and IP Multicast.
Familiarity with hardware and transmission technologies (e.g., cabling, optical fiber, WAN, and transport).
Expert-level understanding of Cisco wireless technologies and frequency bands.
Familiarity with deploying wireless controllers and managing the network using Cisco DNA Center.

Education:

Bachelor's degree in Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or a comparable field of study, and/or equivalent work experience.

About The Walt Disney Company (Corporate):

At Disney Corporate you can see how the businesses behind the Company's powerful brands come together to create the most innovative, far-reaching, and admired entertainment company in the world. As a member of a corporate team, you'll work with world-class leaders driving the strategies that keep The Walt Disney Company at the leading edge of entertainment. See and be seen by other innovative thinkers as you enable the greatest storytellers in the world to create memories for millions of families around the globe.
About The Walt Disney Company:

The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise with the following business segments: Disney Entertainment, ESPN, Disney Parks, and Experiences and Products. From humble beginnings as a cartoon studio in the 1920s to its preeminent name in the entertainment industry today, Disney proudly continues its legacy of creating world-class stories and experiences for every member of the family. Disney's stories, characters, and experiences reach consumers and guests from every corner of the globe. With operations in more than 40 countries, our employees and cast members work together to create entertainment experiences that are both universally and locally cherished.

This position is with Disney Worldwide Services, Inc., which is part of a business we call The Walt Disney Company (Corporate).

Disney Worldwide Services, Inc. is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, sexual orientation, gender identity, disability, protected veteran status, or any other basis prohibited by federal, state, or local law. Disney fosters a business culture where ideas and decisions from all people help us grow, innovate, create the best stories, and be relevant in a rapidly changing world.

#DISNEYTECH

Job Posting Segment: Enterprise Technology
Job Posting Primary Business: End User & Technical Ops
Primary Job Posting Category: Systems/Site Reliability Engineer
Employment Type: Full time
Primary City, State, Region, Postal Code: Lake Buena Vista, FL, USA
Date Posted: 2024-03-06