OpenStack: The OpenSource Cloud’s Application in High Energy Physics. That Title’s Overstated. OpenStack: The OpenSource Cloud’s Potential Application in Data Intensive Research. Not as Catchy. Caveats. I am not a storage or network engineer I am not a scientist.
OpenStack: The OpenSource Cloud’s Application in High Energy Physics
That Title’s Overstated • OpenStack: The OpenSource Cloud’s Potential Application in Data Intensive Research • Not as Catchy...
Caveats • I am not a storage or network engineer • I am not a scientist • despite illusions of grandeur. I am: • a Technical Product Manager. • Dashboard Developer • working for piston{cloud}computing • Pragmatic.
What is OpenStack? • Founded by NASA and Rackspace • The open source cloud computing platform • Feature-rich and massively scalable • Powers cloud storage, compute, and networking • A world-wide open source collaboration
OpenStack as a Cloud OS • Self-service portals for users • Connects to apps via APIs • Creates pools of resources • Automates the network • [Diagram: apps, users, and admins around the cloud operating system]
Benefits of OpenStack as a Common Platform • Easy to migrate data and applications across clouds Based on: • security policies • economics • research needs • No vendor lock-in • Common Layer of Data Exchange • Less exposed to security issues than public cloud, but still interoperable.
3 Major OpenStack Components • OpenStack Compute/Nova: provision and manage large networks of virtual machines • OpenStack Object Store/Swift: Create petabytes of reliable storage using standard servers • OpenStack Image Service/Glance: Catalog and manage large libraries of server images • + • Other components: Dashboard, Load Balancing, Authentication...
Compute/Nova Key Features 1. REST-based API 2. Horizontally and massively scalable 3. Hardware agnostic: supports a variety of standard commodity hardware 4. Hypervisor agnostic: support for Xen, Citrix XenServer, Microsoft Hyper-V, KVM, UML, LXC and ESX
Hypervisor: Turns 1 server into many “virtual machines” (instances or VMs) (VMware ESX, Citrix XenServer, KVM, etc.) • Hypervisors provide an abstraction layer between apps and hardware (servers) • OpenStack pools servers; you run operating systems and applications on VMs instead of physical computers • [Diagram: Host 1, Host 2, Host 3, Host 4, etc., each running VMs]
Nova close up • nova-api daemon • endpoint for all OpenStack or EC2 API queries • nova-scheduler process • takes a virtual machine instance request from the queue and determines which compute server host it should run on • a pluggable architecture allowing custom scheduling algorithms • nova-compute process • worker daemon that creates and terminates virtual machine instances
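The pluggable scheduling idea above can be sketched in a few lines. This is a toy model, not Nova's code: the `Host`, `InstanceRequest`, `most_free_ram`, `fewest_vms`, and `schedule` names are all hypothetical, and the only resource considered is RAM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    """Hypothetical view of one compute server as the scheduler sees it."""
    name: str
    free_ram_mb: int
    running_vms: int


@dataclass
class InstanceRequest:
    """Hypothetical VM request taken from the queue."""
    ram_mb: int


# A scheduling algorithm is any function that picks one host
# from the candidates that can fit the request.
Picker = Callable[[List[Host]], Host]


def most_free_ram(candidates: List[Host]) -> Host:
    return max(candidates, key=lambda h: h.free_ram_mb)


def fewest_vms(candidates: List[Host]) -> Host:
    return min(candidates, key=lambda h: h.running_vms)


def schedule(request: InstanceRequest,
             hosts: List[Host],
             picker: Picker = most_free_ram) -> Optional[Host]:
    """Filter hosts that can fit the request, then let the picker choose."""
    candidates = [h for h in hosts if h.free_ram_mb >= request.ram_mb]
    if not candidates:
        return None  # no host can run this instance
    chosen = picker(candidates)
    # Record the placement so later requests see the updated capacity.
    chosen.free_ram_mb -= request.ram_mb
    chosen.running_vms += 1
    return chosen
```

Swapping the algorithm means passing a different `picker`; the filtering and bookkeeping stay the same.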
Commodity Hardware • Piston Silicon Mechanics • 2 Intel Xeon processors 5600 Series • 96GB of DDR3 RAM • 24TB of SATA storage • Redundant 1200W power supplies • 2U rackmount chassis • That’s what our clients get; we’re on: • 32GB, 16TB, 2 Intel Xeon E5645 processors (DevOps borrowed the rest for other machines)
Performance: 500 VM Spin Up • Assuming: • 500 copies of one 8GB image • Image warm on the nodes • 50 VMs/Server • Based on NASA’s experience in regular use, less than 30 seconds • Worst case: • Image is still in Glance • VM has to be copied via HTTP
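A back-of-the-envelope check of these numbers, taking the image as 8 GB. The link speed is an assumption (the slide gives none), and the model ignores protocol overhead, disk speed, and contention between hosts pulling the image at once.

```python
import math


def hosts_needed(vm_count: int, vms_per_host: int) -> int:
    """Number of servers required at a fixed VM density."""
    return math.ceil(vm_count / vms_per_host)


def image_copy_seconds(image_gb: float, link_gbit_per_s: float) -> float:
    """Idealised time to copy one image to one host over HTTP.

    1 GB is 8 Gbit; link_gbit_per_s is an assumed figure, not from the slides.
    """
    return image_gb * 8 / link_gbit_per_s


if __name__ == "__main__":
    print("hosts:", hosts_needed(500, 50))
    for gbps in (1.0, 10.0):  # assumed link speeds
        print(f"{gbps} Gbit/s:", image_copy_seconds(8, gbps), "s per host")
```

At 50 VMs per server the 500 VMs fit on 10 hosts. In the worst case each host must first pull the image from Glance: roughly 64 seconds on an assumed 1 Gbit/s link, or about 6.4 seconds on 10 Gbit/s, before any VM on that host can start.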
Image Service/Glance 1. Store & retrieve VM images 2. REST-based API 3. Compatible with all common image formats 4. Storage agnostic: Store images locally, or use OpenStack Object Storage, HTTP, or S3
Storage/Swift Key Features 1. REST-based API 2. Data distributed evenly throughout system 3. Runs on commodity hardware 4. Scalable to multiple petabytes, billions of objects 5. No central database required 6. Account/Container/Object structure (not file system, no nesting) plus Replication (N copies of accounts, containers, objects)
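The Account/Container/Object structure maps directly onto the REST path `/v1/{account}/{container}/{object}`. A minimal sketch of building such a URL follows; the endpoint, account, container, and object names are made up for illustration, and `object_url` is a hypothetical helper, not part of any Swift client.

```python
from urllib.parse import quote


def object_url(endpoint: str, account: str, container: str, obj: str) -> str:
    """Build the URL of one object in a Swift-style flat namespace."""
    if "/" in container:
        # Containers cannot nest, so a slash in the name is rejected.
        raise ValueError("container names may not contain '/'")
    return "/".join([
        endpoint.rstrip("/"),
        "v1",
        quote(account, safe=""),
        quote(container, safe=""),
        # Slashes in an object name are just characters in the name;
        # they look like folders but create no real hierarchy.
        quote(obj, safe="/"),
    ])


if __name__ == "__main__":
    print(object_url("https://swift.example.org", "AUTH_demo",
                     "runs", "2011/event 42.root"))
```

A GET, PUT, or DELETE against that URL then reads, writes, or removes the object.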
The Storage Story: Nova • Nova/Compute has its own storage • Block Storage or Nova-volume • an iSCSI solution • employs the use of Logical Volume Manager (LVM) for Linux • intended for read/write purposes (databases, logs, etc.) • basically is an LVM/iSCSI implementation to mount block devices in VM.
The Storage Story: Swift • Swift: Object Storage • Fully Distributed • Commodity Hardware (Linux/x86) • Data Protection in Software • Not a File System • Not SAN/NAS/DAS... or any attached storage • Optimized for Scale - Petabytes
Swift in Production • Swift has been running in production at Rackspace for over a year with near 100% uptime. • Rackspace’s Swift clusters store billions of objects and petabytes of data. • Internap, KT, SDSC, and HP are also running Swift in production
Sharing the Research • Common software platform making federation possible through a shared API. • To federate Swift across locations, you write a scheduler within OpenStack and drive it through the API. • [Diagram: Swift; OS or EC2 API; private clouds at Location A and Location B]
Swift Components Clients Proxy Servers Rings Account Servers Container Servers Object Servers
Swift Components • Proxy Server • Tie together the Swift architecture • Request routing • Exposes the public API
Swift Components • The Ring: Maps names to entities (accounts, containers, objects) on disk. • Maintains the mapping using zones, devices, partitions, and replicas • Weights can be used to balance the distribution of partitions • Used by the Proxy Server and by many background processes
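A toy ring makes the mapping concrete. The sketch below is a simplification under stated assumptions, not Swift's ring builder: it hashes the entity path with MD5 and keeps the top bits as the partition number, then assigns each partition's replicas to devices in distinct zones. Device weights and the per-cluster hash suffix are omitted, and the `PART_POWER`, `partition_for`, and `build_assignment` names are hypothetical.

```python
import hashlib
import struct
from typing import Dict, List, Optional

PART_POWER = 4   # 2**4 = 16 partitions; production rings use far more
REPLICAS = 3


def partition_for(account: str,
                  container: Optional[str] = None,
                  obj: Optional[str] = None) -> int:
    """Map an account, container, or object name to a partition number."""
    parts = [p for p in (account, container, obj) if p is not None]
    path = "/" + "/".join(parts)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    # Keep only the top PART_POWER bits of the first 32 bits of the hash.
    return struct.unpack(">I", digest[:4])[0] >> (32 - PART_POWER)


def build_assignment(devices_by_zone: Dict[str, List[str]],
                     replicas: int = REPLICAS,
                     part_power: int = PART_POWER) -> List[List[str]]:
    """Return table[partition] = one device per replica, each in its own zone."""
    zones = sorted(devices_by_zone)
    if len(zones) < replicas:
        raise ValueError("need at least as many zones as replicas")
    table = []
    for part in range(2 ** part_power):
        row = []
        for replica in range(replicas):
            # Consecutive zones for consecutive replicas, so no zone repeats.
            zone = zones[(part + replica) % len(zones)]
            devs = devices_by_zone[zone]
            row.append(devs[part % len(devs)])
        table.append(row)
    return table


if __name__ == "__main__":
    table = build_assignment({"z1": ["sda", "sdb"], "z2": ["sdc"], "z3": ["sdd"]})
    part = partition_for("AUTH_demo", "runs", "event.root")
    print("partition", part, "lives on", table[part])
```

Because the lookup is pure computation over a small table, any proxy or background process holding a copy of the ring can locate data without asking a central database.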
Swift Components... • Object Server: • Blob storage server • metadata kept in xattrs • data in binary format • Object path derived from a hash of the object name and the operation's timestamp
Swift & Large Object Storage • default 5GB limit on the size of an uploaded object • segmentation makes the download size of a single object virtually unlimited • segments of the large object are uploaded and a special manifest file is created • when downloaded, all segments are concatenated as a single object. • greater upload speed • possible parallel uploads of segments.
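The segment-and-concatenate scheme can be sketched against an in-memory dict standing in for a container. The helper names and the zero-padded naming convention are assumptions for illustration; in Swift itself the manifest is a separate object that points at the segment prefix (via the `X-Object-Manifest` header), and the server does the concatenation on download.

```python
from typing import Dict, List


def split_into_segments(data: bytes, segment_size: int) -> List[bytes]:
    """Cut data into consecutive chunks of at most segment_size bytes."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def upload_segmented(store: Dict[str, bytes], name: str,
                     data: bytes, segment_size: int) -> str:
    """Store each segment under '<name>/<index>' and return the shared prefix.

    The returned prefix plays the role of the manifest: it is all a
    reader needs in order to find every segment.
    """
    prefix = f"{name}/"
    for index, segment in enumerate(split_into_segments(data, segment_size)):
        # Zero-padded indexes keep name order equal to segment order.
        store[f"{prefix}{index:08d}"] = segment
    return prefix


def download(store: Dict[str, bytes], prefix: str) -> bytes:
    """Concatenate all segments under the prefix, in name order."""
    names = sorted(n for n in store if n.startswith(prefix))
    return b"".join(store[n] for n in names)


if __name__ == "__main__":
    container: Dict[str, bytes] = {}
    payload = bytes(range(256)) * 10
    manifest_prefix = upload_segmented(container, "big.dat", payload, 1000)
    assert download(container, manifest_prefix) == payload
```

Since each segment is an independent object, the uploads in the loop could run in parallel, which is where the speed-up on the slide comes from.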
But Wait, Swift... • Doesn’t load balance for often requested objects. • throw Varnish Cache or Squid Proxy in front of Swift • Has a “simple” ReSTful API • Wasn't intended for storing unknown data • Isn’t searchable • Is like Amazon’s S3
Potential Solutions for Those Needing to Search Data • Or wait... • Swift’s blueprints include searchable metadata • https://blueprints.launchpad.net/swift/+spec/future-searchable-metadata • Contribute to the greater community
What’s Piston Doing Different? • Piston Enterprise OS: • A hardened cloud operating system built on OpenStack™ • Optimized for secure and easy operation of enterprise private clouds • Fully supports interoperability with other OpenStack™ powered public and private cloud solutions.
{pentOS}™ features • {CloudKey}™ • Two-factor capable physical authentication • Minimizes security risk of administrative logins • Hands-free install in under 5 minutes • Null-Tier [Architecture]™ • Storage, compute and networking on every node • Massively scalable • Automated scaling
{pentOS}™ Null-Tier [Architecture]™ • {CloudKey}™: Hands-Free OS Install and Configuration • Server<1> through Server<N>, each providing: Networking, Storage, Compute, Management • Highly available Virtual Storage • Highly available Virtual Machines • Highly available {pentOS} controllers • Top of Rack Switch
Contact • Neil Johnston • email: neil@pistoncloud.com • twitter: @neiljohnston • Or my co-authors: • Joshua McKenty • email: josh@pistoncloud.com • Christopher MacGown • email: chris@pistoncloud.com