Description
Are you interested in delivering large-scale, high-performance, fault-tolerant solutions? Oracle’s Cloud Infrastructure team is building a next-generation Infrastructure-as-a-Service that supports the most demanding mission-critical customer requirements and operates at cloud scale to provide a secure, distributed, multi-tenant cloud environment.
We're looking for hands-on engineers with a passion for solving difficult problems in distributed systems, virtualized infrastructure, and highly available services. Joining Oracle will give you the opportunity to design and build innovative new systems from the ground up and operate services at scale. Our engineers have significant technical and business impact while delivering critical enterprise-level features.
Job Description
As a Principal Member of Technical Staff, you will work with senior architects and product management to define requirements for OCI’s upcoming AI/ML storage infrastructure services. You have deep experience with Lustre parallel filesystems operating in large-scale Linux environments. Ideally, you have a working understanding of the Lustre architecture and codebase and have used that knowledge to troubleshoot issues, modify code, or contribute improvements back to the Lustre git tree. Expertise in one or more Public Cloud offerings is a plus. You will be expected to make substantial contributions to our design and architecture and to implement proofs of concept. You have excellent communication skills and can clearly explain complex technical concepts. As a technical leader on your team, you will mentor more junior engineers and demonstrate core values for them. You will write code, review code written by your peers, and write test automation. You should value simplicity and scale, work comfortably in a collaborative, agile environment, and be excited to learn.
Career Level - IC4
Qualifications
- 6+ years of experience delivering and operating large-scale, highly available distributed systems.
- Substantial system administration or code-level experience with Lustre filesystems operating in large-scale Linux environments.
- Strong proficiency with C and C++. Python and/or Java is a plus.
- Expertise in one or more Public Cloud offerings (OCI, AWS, GCP, Azure) is a plus.
- Experience with other high-throughput I/O architectures like DAOS/SPDK is a strong plus.
- Background in RDMA and high-performance networking (SmartNICs, NVMe/TCP, RoCEv2) is a plus.
- Familiarity with AI/ML frameworks (TensorFlow/Keras, PyTorch, scikit-learn, XGBoost, Caffe) as well as MLOps and Kubernetes is a plus.
- Strong knowledge of data structures, algorithms, operating systems, and distributed systems fundamentals.
- Strong troubleshooting and performance tuning skills.
- Self-motivation to thrive in a fast-paced environment.
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
Disclaimer:
Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.
Range and benefit information provided in this posting are specific to the stated locations only.
US: Hiring Range: from $94,200 to $223,500 per annum. May be eligible for bonus and equity.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle’s differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.