Data Scientist and AIOps Developer/Administrator (Scientist 2/3)
Company: Los Alamos National Laboratory
Location: Los Alamos
Posted on: May 4, 2024
Job Description:
What You Will Do
The High Performance Computing (HPC) Division at Los Alamos
National Laboratory provides scientific computing resources
consisting of some of the largest HPC systems in the world as well
as numerous large commodity clusters. Our HPC Systems Group
(HPC-SYS) is creating a new AIOps team to make better use of our
vast amounts of HPC system and application data. This team will add
AI solutions to our HPC infrastructure. The new team will work
alongside the other teams in HPC-SYS: Monitoring, Web Services, and
Cybersecurity. The Monitoring Team is responsible for collecting
all data from our Data Centers, everything from Facilities to
Clusters, and implementing operational dashboards, alerts and
reports using tools like Splunk. Our Web Services Team runs our
admin- and user-facing websites, including user Documentation,
Ticketing systems and GitLab. Our Cybersecurity Team monitors and
implements cybersecurity policies on our HPC systems.
The AIOps team will have three major focus areas: LLMs, System Data
Analysis and System Automation. We have a large set of HPC-specific
documentation for both users and admins that will be integrated
into LLMs; this team will be responsible for designing, building,
and running the LLMs. We have massive amounts of system data. This
team will implement ML and Data Science techniques to perform
deeper analysis of the data to improve performance analysis, event
correlation and anomaly detection. Finally, the team will
investigate AI-driven workflows for task automation within the Data
Center. You will work closely with other members of the AIOps team
and Subject Matter Experts (SMEs) in different HPC areas to design
and develop these tools using our on-prem analysis and Gen AI
systems.
The successful candidate's scope will include monitoring and
analyzing system performance to identify anomalies. You will
analyze large volumes of data to identify patterns and trends using
ML and Data Science techniques with the goal of developing
automation scripts and workflows to implement proactive measures.
You and the team will maintain the AIOps platforms and tools
including the user-facing LLMs. The successful candidate will
continue actively growing their technical skills and keeping up to
date with the latest technologies in the field. In addition, the
selected candidate will have the opportunity to develop technical
products such as technical documentation, presentations, technical
papers, and reports, to communicate findings internally and at
conferences. This position is full-time and is located at Los
Alamos National Laboratory in Los Alamos, New Mexico.
This position will be filled at either the Scientist 2 or Scientist
3 level, depending on the skills of the selected candidate.
Additional job responsibilities (outlined below) will be assigned
if the candidate is hired at the higher level.
What You Need
Minimum Job Requirements:
Scientist 2: ($101,700 - $168,200)
- Knowledge of Linux system administration, including command line
Linux operating system skills, knowledge of hardware and software
security practices
- Experience building, testing, evaluating and running LLMs
- Knowledge of Machine Learning and Data Science techniques for data
analysis
- Strong knowledge of Python and AI frameworks
- Ability to clean, preprocess, and analyze large datasets
- Knowledge of containerization technologies such as Docker and
Kubernetes
Additional Job Requirements for Scientist 3:
Scientist 3: ($122,300 - $206,300)
In addition to the Job Requirements outlined above, qualification
at the higher level requires:
- Extensive experience analyzing system log and metric data with
strong statistical analysis skills and understanding of ML
algorithms
- Experience with Machine Learning and Data Science techniques for
data analysis, anomaly detection and event correlation
- Knowledge of anomaly detection techniques and time series
analysis
- Knowledge of how to safeguard LLMs with guardrails and prompt
engineering
- Experience fine-tuning Foundation Models
Education/Experience at lower level:
Position requires a Bachelor's degree in a STEM field from an
accredited college or university and 4 years of relevant
experience or an equivalent combination of education and experience
directly related to the occupation.
Education/Experience at higher level:
Position requires a Master's degree in a STEM field from an
accredited college or university and 6 years of relevant experience
or an equivalent combination of education and experience directly
related to the occupation.
Desired Qualifications:
- Experience building and running RAG-based LLMs
- Experience with implementing AIOps tools and workflows from ML
Analysis to system automation and configuration management
- Experience running workflows on NVIDIA DGX/HGX systems or
pods
- Experience using Git for version control
- Experience integrating operational metrics into a monitoring system
such as Splunk
- Familiarity with monitoring and logging tools like Syslog,
Telegraf, Prometheus, Grafana, etc.
- Experience with deep learning frameworks such as TensorFlow or
PyTorch
- Demonstrated effective communication skills, including demonstrated
ability to work productively with customers and vendors
- High attention to detail including excellent organizational skills,
analytical thinking, observational and problem-solving skills
- Proven ability to independently multi-task and adjust to the
workings of a dynamic and fast-paced environment
- An Active DOE Q Clearance
Work Location:
This position will be located in Los Alamos, NM, with the potential
for a hybrid work arrangement (60% onsite/40% offsite) from a
location within 2 hours ground commute of this location. Reporting
onsite will be required. Hybrid is at the discretion of management
and can change at any time with appropriate notice.
Position commitment: Regular appointment employees are required to
serve a period of continuous service in their current position in
order to be eligible to apply for posted jobs throughout the
Laboratory. If an employee has not served the time required, they
may only apply for Laboratory jobs with the documented approval of
their Division Leader. The position commitment for this position is
1 year.
Note to Applicants:
For consideration, applicants should submit a cover letter
addressing how their knowledge, skills and abilities meet the
minimum requirements along with a resume.
Where You Will Work
Located in beautiful northern New Mexico, Los Alamos National
Laboratory (LANL) is a multidisciplinary research institution
engaged in strategic science on behalf of national security. Our
generous benefits package includes:
- PPO or High Deductible medical insurance with the same large
nationwide network
- Dental and vision insurance
- Free basic life and disability insurance
- Paid childbirth and parental leave
- Award-winning 401(k) (6% matching plus 3.5% annually)
- Learning opportunities and tuition assistance
- Flexible schedules and time off (PTO and holidays)
- Onsite gyms and wellness programs
- Extensive relocation packages (outside a 50 mile radius)
Additional Details
Directive 206.2 - Employment with Triad requires a favorable
decision by NNSA indicating employee is suitable under NNSA
Supplemental Directive 206.2
(https://directives.nnsa.doe.gov/supplemental-directive/sd-0206-0002).
Please note that this requirement applies only to citizens of the
United States. Foreign nationals are subject to a similar
requirement under DOE Order 142.3A.
Clearance: Q (Position will be cleared to this level). Selected
applicants will be subject to a background investigation conducted
by or on behalf of the Federal Government, and must meet
eligibility requirements* for access to classified matter. This
position requires a Q clearance, and obtaining such clearance
requires US Citizenship except in extremely rare circumstances.
Dependent upon the position, additional authorization to access
classified information may be required, which may or may not be
available to dual citizens. Receipt of a Q clearance and additional
access authorization ultimately is a decision of the Federal
Government and not of Triad.
*Eligibility requirements: To obtain a clearance, an individual
must be at least 18 years of age; U.S. citizenship is required
except in very limited circumstances. See DOE Order 472.2
(https://www.directives.doe.gov/directives-documents/400-series/0472.2-BOrder-chg1-pgchg)
for additional information.
New-Employment Drug Test: The Laboratory requires successful
applicants to complete a new-employment drug test and maintains a
substance abuse policy that includes random drug testing. Although
New Mexico and other states have legalized the use of marijuana,
use and possession of marijuana remain illegal under federal law. A
positive drug test for marijuana will result in termination of
employment, even if the use was pre-offer.
Regular position: Term status Laboratory employees applying for
regular-status positions are converted to regular status.
Internal Applicants: Regular appointment employees who have served
the required period of continuous service in their current position
are eligible to apply for posted jobs throughout the Laboratory. If
an employee has not served the required period of continuous
service, they may only apply for Laboratory jobs with the
documented approval of their Division Leader. Please refer to
Policy P701 for applicant eligibility requirements.
Equal Opportunity: Los Alamos National Laboratory is an equal
opportunity employer and supports a diverse and inclusive
workforce. All employment practices are based on qualification and
merit, without regard to race, color, national origin, ancestry,
religion, age, sex, gender identity, sexual orientation, marital
status or spousal affiliation, physical or mental disability,
medical conditions, pregnancy, status as a protected veteran,
genetic information, or citizenship within the limits imposed by
federal laws and regulations. The Laboratory is also committed to
making our workplace accessible to individuals with disabilities
and will provide reasonable accommodations, upon request, for
individuals to participate in the application and hiring process.
To request such an accommodation, please send an email to
applyhelp@lanl.gov or call 1-505-665-4444 option 1.