IT Systems Supervisor

Plymouth Rock Assurance • Boston, MA

Not Applicable • Posted April 17, 2026

We are looking for a change agent, not a caretaker. This role demands an automation-first mindset, strong technical depth across Windows and Linux environments, and the leadership presence to raise the bar for how our infrastructure team operates. You will own the health, reliability, and evolution of the firm's compute and M365 environment while developing a high-performing team.

Essential Functions and Responsibilities:

Compute & M365 Stability and Availability
Ensure the reliability, performance, and availability of all compute infrastructure and M365 services — on-prem servers, virtual machines, cloud instances, Exchange Online, Teams, SharePoint, and remote access services
Own regular maintenance windows for patching, upgrades, and housekeeping across Windows, Linux, AWS compute, and M365 workloads
Establish and enforce operational standards, runbooks, and procedures that create consistency and reduce dependency on tribal knowledge
Monitor compute and M365 environment health continuously; act on signals before they become incidents
Manage compute capacity across on-prem and AWS, ensuring right-sized, available resources that meet demand without unnecessary cost

M365 Administration
Own firm-wide M365 administration including Exchange Online, Teams, SharePoint, OneDrive, Entra ID, and associated services
Manage tenant configuration, licensing, service health, and policy governance across the M365 platform
Partner with the business to understand collaboration needs and translate them into well-governed M365 solutions
Maintain a clear operational model for M365 — covering administration, support escalation, and change management
Stay current on the M365 roadmap; evaluate new capabilities and drive adoption where they deliver business value

Incident Management
Lead incident response for P1/P2 compute and M365 events; coordinate across teams and drive to resolution with urgency and clarity
Establish and maintain incident runbooks, post-mortems, and lessons-learned processes
Track incident trends and use data to drive systemic fixes and reduce repeat events
Ensure clear on-call coverage, escalation paths, and communication protocols are in place and understood by the team

Vulnerability & Security Operations
Own the vulnerability management lifecycle across compute: scanning, prioritization, remediation tracking, and reporting
Partner with the Infosec team to align operational processes with security policies, ensuring consistent enforcement across compute and M365
Serve as the operational bridge between infrastructure and security — translating policy requirements into executable team workflows
Ensure timely closure of critical and high findings, with clear escalation paths for exceptions and risk acceptance

Observability Platform
Define, deliver, and own the firm's observability platform for compute and M365 — spanning open source and commercial tooling
Architect a unified view of environment health across on-prem, AWS, and M365 (metrics, logs, traces, and events)
Establish proactive alerting, dashboards, and runbooks to reduce MTTR
Drive adoption of the platform across Windows and Linux teams, ensuring consistent coverage and actionable signal

Automation & Tooling
Champion an automation-first culture; eliminate manual, repetitive operational tasks through scripting and orchestration (Ansible, PowerShell, Bash, Terraform)
Drive infrastructure-as-code adoption for compute provisioning and configuration across on-prem and AWS
Leverage M365 automation capabilities — Power Automate, Graph API, and PowerShell — to streamline administration and reduce manual effort
Identify and implement tooling to reduce toil and accelerate delivery

Technology Evolution & Technical Debt
Maintain a living inventory of technical debt across all compute and M365 ownership areas — server platforms, operating systems, virtualization, messaging, collaboration, and remote access
Develop and own multi-horizon technology roadmaps that balance operational stability with modernization
Make the case for investment: translate technical debt and risk into business impact for leadership
Establish a cadence of review and retirement — ensuring aging technologies are actively replaced, not just maintained
Champion forward-looking decisions on platform lifecycle, vendor strategy, and architectural direction

Cross-Functional Partnership & Technology Strategy
Partner with the AppDev team to ensure prompt, reliable delivery of compute services — with clear cost accountability and service-level expectations on both sides
Collaborate with FinOps to drive compute and M365 cost optimization, ensuring spend is visible, justified, and continuously improved
Partner with Architecture to create, update, and execute against technology roadmaps that align compute and M365 direction with firm-wide strategy
Continuously evaluate emerging technologies to reduce risk, lower costs, and improve the reliability, scalability, maintainability, and security of the environment
Represent compute and M365 capabilities and constraints in cross-functional planning forums, ensuring operational realities inform strategic decisions

Team Leadership & Workload Management
Lead and mentor a team of Windows and Linux admins; bridge the gap between both disciplines
Manage operational queue: balance incident response, project work, and proactive improvements
Drive accountability through sprint planning and ticket hygiene with clear escalation paths
Conduct regular 1:1s, performance reviews, and career development conversations

Project & Change Accountability
Own end-to-end delivery of compute and M365 projects — on scope, on time, with documented outcomes
Manage change control processes; reduce risk through peer review and staged rollouts
Communicate status, risks, and blockers to leadership proactively

Technology Environment:
Messaging & Collaboration: M365, Exchange Online, Teams, SharePoint, OneDrive
Legacy Messaging: On-Prem Exchange
Compute Infrastructure: Windows Server, Linux, VMware ESX
Cloud Compute (AWS): EC2, Auto Scaling Groups, WorkSpaces
Remote Access & VDI: Citrix, Ivanti Secure
Identity: Entra ID

What success looks like:
Compute and M365 stability and availability metrics are defined, baselined, and improving
M365 tenant is well-governed with clear ownership, licensing discipline, and a roadmap for new capability adoption
Observability platform is defined, roadmapped, and in active delivery within 60 days
Technology roadmaps exist for all compute and M365 ownership areas within 90 days
Technical debt is inventoried, risk-rated, and tied to a remediation plan
Cross-functional partnerships with AppDev, FinOps, and Architecture are active and productive
Compute and M365 spend is visible, optimized, and trending in the right direction
Operational incidents decrease quarter over quarter due to proactive monitoring
Manual tasks are systematically identified and automated within 90 days
Vulnerability backlog is tracked, prioritized, and trending down
Team has clear project ownership with no items lost in the queue
Capacity reports delivered monthly with forward-looking recommendations
Team members report clarity on priorities and feel supported to grow

Qualifications and Education:
Bachelor's degree in Computer Information Systems, Computer Science, Information Technology, or a related field; equivalent experience will be considered

Required
5-7 years managing Windows Server and Linux compute infrastructure in enterprise environments
Proven people leadership experience with mixed-skill technical teams
Deep hands-on experience with M365 administration at the enterprise level — Exchange Online, Teams, SharePoint, Entra ID, and tenant governance
Experience managing the transition from or coexistence with on-premises Exchange
Expertise with VMware ESX/vSphere virtualization platforms
AWS compute experience including EC2, Auto Scaling Groups, and WorkSpaces
Experience with VDI/remote access platforms (Citrix, Ivanti Secure)
Demonstrated experience defining and delivering an observability or monitoring platform
Experience owning vulnerability management and patch compliance programs across compute
Track record of developing technology roadmaps and systematically retiring technical debt
Experience partnering with FinOps, AppDev, or Architecture teams in a cross-functional capacity
Strong scripting and automation skills (PowerShell, Graph API, Bash, or equivalent)

Preferred
Experience with infrastructure-as-code tools (Terraform, Ansible, CloudFormation)
Hands-on experience with open source observability tools (e.g. Prometheus, Grafana, OpenTelemetry, ELK)
Familiarity with ITSM and workload management platforms
AWS Solutions Architect or equivalent cloud certification
Microsoft 365 certification (MS-102 or equivalent)
Background in compute capacity planning and performance tuning at scale

Salary Range: $125,000-$179,500 per year. Actual compensation will vary based on multiple factors, including employee knowledge and experience, role scope, business needs, geographical location, and internal equity.

About the Company:

The Plymouth Rock Company and its affiliated group of companies write and manage over $2.2 billion in personal and commercial auto and homeowner’s insurance throughout the Northeast and mid-Atlantic, where we have built an unparalleled reputation for service. We continuously invest in technology, our employees thrive in our empowering environment, and our customers are among the most loyal in the industry. The Plymouth Rock group of companies employs more than 2,000 people and is headquartered in Boston, Massachusetts. Plymouth Rock Assurance Corporation holds an A.M. Best rating of “A-/Excellent”.