Learn how to deploy Hybris on Windows Azure for cloud data hosting, using Azure Blob Storage to store media files and attachments. Understand the architecture, utilize Cloud Windows Azure services, and manage your data efficiently.
Hybris – cloud - bigdata V1.0 19/11/2014 Yassine MEJRI
Hybris-cloud-bigdata: Agenda
- Cloud, Windows Azure
- Deploying Hybris on Windows Azure
- Elasticsearch
- Kibana
- Use cases: analytics, machine learning
CLOUD: Cloud Computing
- A standardised IT capability (services, software or infrastructure) delivered via internet technologies in a pay-per-use, self-service way.
- Cloud services are shared services, under virtualised management, accessible over the internet.
- A style of computing where massively scalable IT-related capabilities are provided "as a service" using internet technologies to multiple external customers.
CLOUD: History
- 1960: John McCarthy's concept: "Computation may someday be organized as a public utility."
- 1999: Salesforce.com: "Pioneered the concept of delivering enterprise applications via a simple website"
- 2000: Microsoft; 2001: IBM: "Expanded the SaaS concept through web services"
- 2005: Amazon: "Launch of Amazon Web Services"
- 2007: Google and IBM: "Start researching cloud computing"
- 2008: Gartner Research: "Start using cloud computing in many organizations"
Cloud http://www.cloudscreener.com/ Cloud computing providers
CLOUD WINDOWS AZURE
WINDOWS AZURE WINDOWS AZURE LAYERS
WINDOWS AZURE Cloud service model
Windows Azure: Geo-location of datacenters
- US: South Central US, North Central US, West US, East US
- Europe: Western Europe, Northern Europe
- Asia: South East Asia, East Asia
WINDOWS AZURE Building and running apps
Windows Azure Blob Storage
Azure Blob storage is a service for storing large amounts of unstructured data, such as text or binary data, that can be accessed from anywhere in the world via HTTP or HTTPS.
Common uses of Blob storage include:
- Serving images or documents directly to a browser
- Storing files for distributed access
- Streaming video and audio
- Performing secure backup and disaster recovery
Architecture
Services: PutBlob, GetBlob, DeleteBlob, CopyBlob, SnapshotBlob, LeaseBlob…
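Because every blob is reachable over plain HTTP or HTTPS, its address can be derived from three names. The following is a minimal sketch, assuming the public endpoint pattern <account>.blob.core.windows.net/<container>/<blob>; the class and method names are illustrative and not part of the Azure SDK.

```java
/** Illustrative helper (not an Azure SDK class): builds the public URL of a blob. */
final class BlobUrlSketch {

    /**
     * Assumes the public endpoint pattern
     * <protocol>://<account>.blob.core.windows.net/<container>/<blob>.
     */
    static String blobUrl(String account, String container, String blobName, boolean https) {
        String protocol = https ? "https" : "http";
        return protocol + "://" + account + ".blob.core.windows.net/" + container + "/" + blobName;
    }

    public static void main(String[] args) {
        // Hypothetical account, container and blob names.
        System.out.println(blobUrl("myaccount", "images", "myimage.jpg", true));
    }
}
```

A browser can fetch such a URL directly only when the container allows public read access; otherwise the request has to be authenticated.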
Windows Azure Blob Storage: Java API
Connection string:
    public static final String storageConnectionString =
        "DefaultEndpointsProtocol=http;" +
        "AccountName=your_storage_account;" +
        "AccountKey=your_storage_account_key";
Create container:
    CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnectionString);
    CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
    CloudBlobContainer container = blobClient.getContainerReference("images");
    container.createIfNotExists();
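The connection string is a semicolon-separated list of key=value settings. As a rough illustration of its structure (this is not how CloudStorageAccount.parse is implemented, and the class name is hypothetical), it can be split into a map. The sketch splits each pair on the first '=' only, because Base64-encoded account keys usually end with '=' padding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for "Key=Value;Key=Value" storage connection strings. */
final class ConnectionStringSketch {

    static Map<String, String> parse(String connectionString) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String pair : connectionString.split(";")) {
            if (pair.isEmpty()) {
                continue; // tolerate empty segments such as ";;"
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Malformed setting: " + pair);
            }
            // Split on the first '=' only: the value itself may contain '='.
            settings.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = parse(
            "DefaultEndpointsProtocol=http;AccountName=your_storage_account;AccountKey=your_storage_account_key");
        System.out.println(settings.get("AccountName"));
    }
}
```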
Windows Azure Blob Storage: Java API
Change permissions:
    BlobContainerPermissions containerPermissions = new BlobContainerPermissions();
    containerPermissions.setPublicAccess(BlobContainerPublicAccessType.CONTAINER);
    container.uploadPermissions(containerPermissions);
Upload blob:
    final String filePath = "C:\\myimages\\myimage.jpg";
    CloudBlockBlob blob = container.getBlockBlobReference("myimage.jpg");
    File source = new File(filePath);
    blob.upload(new FileInputStream(source), source.length());
Download blob:
    for (ListBlobItem blobItem : container.listBlobs()) {
        if (blobItem instanceof CloudBlob) {
            CloudBlob blob = (CloudBlob) blobItem;
            blob.download(new FileOutputStream("C:\\mydownloads\\" + blob.getName()));
        }
    }
Windows Azure Storage: Tables (NoSQL)
http://<account>.table.core.windows.net/<table>
Services: Insert, Update, Delete, Query, Entity Group Transaction…
Windows Azure Storage: Queue
http://<account>.queue.core.windows.net/<queue>/messages
Services: Put, Get, Peek, Delete, Update…
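Table and queue endpoints are built from the storage account name and the resource name. A small sketch that only reproduces the two URL patterns shown on the Tables and Queue slides; the helper names are hypothetical and no request signing or authentication is shown.

```java
/** Illustrative helper: builds table and queue endpoint URLs from the slide patterns. */
final class StorageEndpointSketch {

    /** http://<account>.table.core.windows.net/<table> */
    static String tableUrl(String account, String table) {
        return "http://" + account + ".table.core.windows.net/" + table;
    }

    /** http://<account>.queue.core.windows.net/<queue>/messages */
    static String queueMessagesUrl(String account, String queue) {
        return "http://" + account + ".queue.core.windows.net/" + queue + "/messages";
    }

    public static void main(String[] args) {
        // Hypothetical account, table and queue names.
        System.out.println(tableUrl("myaccount", "orders"));
        System.out.println(queueMessagesUrl("myaccount", "jobs"));
    }
}
```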
CLOUD Windows Azure Management Console
Cloud: Windows Azure SDK
PowerShell:
    Import-AzurePublishSettingsFile -PublishSettingsFile "full path to downloaded file"
    New-AzureAffinityGroup -Name pslab-group -Location "East US"
    New-AzureQuickVM -ImageName $VMImage -Windows -Name $myVMName -ServiceName $myVMName -AdminUsername $myAdminName -Password $myAdminPwd -AffinityGroup pslab-group
    Stop-AzureVM -Name $myVMName -ServiceName $myVMName
    Start-AzureVM -Name $myVMName -ServiceName $myVMName
    Restart-AzureVM -Name $myVMName -ServiceName $myVMName
Azure SDK: PowerShell, Node.js, Java…
HYBRIS Use Case : Deploying Hybris on Windows Azure Deploy Hybris
Deploy Hybris: Architecture
Auto-scalable, horizontally and vertically.
- VIP: Windows Azure Load Balancer (Failover, Round Robin, Performance)
- CDN, HTTP/HTTPS
- Cloud Service F.O: nodes N1, N2, … Ni
- Cloud Service B.O: nodes N1, N2, … Ni
- Azure SQL Server
- Azure Blob Storage: medias, files, attachments, orders.pdf…
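Of the three load-balancing policies named for the VIP, Round Robin is the simplest to picture: requests are handed to N1, N2, … Ni in turn and then wrap around. The sketch below only illustrates that policy; it makes no claim about how the Azure Load Balancer is implemented, and the class name and node labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative round-robin selector over a fixed list of node names. */
final class RoundRobinSketch {

    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    /** Returns the next node in order, wrapping around after the last one. */
    String pick() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch balancer = new RoundRobinSketch(Arrays.asList("N1", "N2", "N3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.pick());
        }
    }
}
```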
Hybris-CLOUD: Azure cloud extension
Windows Azure Blob provides a simple web services interface that can be used to store and retrieve any amount of data. You can configure a specific MediaFolder to store the binary data of a Media item directly in Windows Azure Blob.
To configure your folder to use Windows Azure Blob you need:
- A Windows Azure account
- Properly created access keys
For more details read http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/.
Hybris-cloud Azure cloud Extension
Hybris-CLOUD: Azure cloud extension
https://wiki.hybris.com/display/release5/Using+Windows+Azure+Blob+Media+Storage+Strategy
1. Import extension: azurecloud
2. Configure blob storage in local.properties (global settings):
    media.globalSettings.accountKey=
    media.globalSettings.accountName=
    media.globalSettings.connection=UseDevelopmentStorage\=True
    media.globalSettings.endPointProtocol=http
    media.globalSettings.local.cache=true
    media.globalSettings.public.base.url=http://127.0.0.1:10000/devstoreaccount1
    media.globalSettings.secured=true
    media.globalSettings.storage.strategy=windowsAzureBlobStorageStrategy
    media.globalSettings.url.strategy=windowsAzureBlobURLStrategy
Hybris-CLOUD: Azure cloud extension
3. How to create a new blob storage folder:
    ……..
    media.folder.invoices.accountKey=
    media.folder.invoices.accountName=
    media.folder.invoices.connection=UseDevelopmentStorage\=True
    media.folder.invoices.endPointProtocol=http
    media.folder.invoices.local.cache=true
    media.folder.invoices.public.base.url=http://127.0.0.1:10000/devstoreaccount1
    media.folder.invoices.secured=true
    media.folder.invoices.storage.strategy=windowsAzureBlobStorageStrategy
    media.folder.invoices.url.strategy=windowsAzureBlobURLStrategy
    ……..
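The folder keys mirror the global keys one to one: media.folder.<folderName>.<setting> against media.globalSettings.<setting>. The sketch below shows one plausible way to read such a pair of key families, assuming that a folder-specific value takes precedence and the global value is the fallback. That precedence rule and the class name are assumptions made for illustration, not something the slides state.

```java
import java.util.Properties;

/** Illustrative lookup: folder-specific media setting with a global fallback (assumed rule). */
final class MediaFolderSettingsSketch {

    static String resolve(Properties props, String folder, String setting) {
        String folderValue = props.getProperty("media.folder." + folder + "." + setting);
        if (folderValue != null) {
            return folderValue;
        }
        // Assumption: fall back to the global value when the folder does not override it.
        return props.getProperty("media.globalSettings." + setting);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("media.globalSettings.storage.strategy", "windowsAzureBlobStorageStrategy");
        props.setProperty("media.globalSettings.secured", "false");
        props.setProperty("media.folder.invoices.secured", "true");

        System.out.println(resolve(props, "invoices", "secured"));
        System.out.println(resolve(props, "invoices", "storage.strategy"));
    }
}
```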
Hybris-CLOUD: Azure cloud extension
4. Storing media files:
    final MediaModel media = modelService.create(MediaModel.class);
    media.setCatalogVersion(catalogVersionService.getCatalogVersion("productCatalog", "Staged"));
    final MediaFolderModel folder = mediaService.getFolder("invoices");
    media.setFolder(folder);
    modelService.save(media);
Hybris-cloud Secure media access
Hybris-cloud: Secure media access
You can enable secure media access for a specific Media folder by setting the following property to true in your local.properties file:
    media.folder.<folderName>.secured=true
This means that only secure URLs are rendered for the Media items stored in that folder, and that access to these medias is filtered only by the SecureMediaFilter.
Managing permissions:
- Use the MediaPermissionService.
- Using hMC: you can grant or deny access to a Media item for a given principal by opening the Media item and going to the Security tab.
- Using ImpEx: the following import script grants access to the Media item with code 1017895.jpg to the editor principal:
    INSERT_UPDATE media; code[unique=true]; catalogVersion(catalog(id),version)[unique=true]; permittedPrincipals(uid);
    ;1017895.jpg; clothescatalog:Staged;editor;
Hybris-CLOUD: Azure cloud extension
Example media URL: http://hybrisazure.blob.core.windows.net/hybris/sys_master/root/h3e/hd7/8796157378590.jpg
Initialize or update Hybris:
Keep in mind that even if the name of the custom container is myContainer, a prefix with the tenant ID is added automatically, so the final container name is sys-master-myContainer. The pattern is sys-<tenantID>-<containerName>.
To control whether Windows Azure storage is cleaned on a fresh initialization, use the following global property:
    media.globalSettings.windowsAzureBlobStorageStrategy.cleanOnInit={true or false}
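The naming pattern sys-<tenantID>-<containerName> can be written out as a one-line helper. This is only a sketch of the pattern quoted above; the class and method names are hypothetical and not part of the azurecloud extension.

```java
/** Illustrative helper: applies the sys-<tenantID>-<containerName> pattern. */
final class ContainerNameSketch {

    static String effectiveContainerName(String tenantId, String customContainerName) {
        return "sys-" + tenantId + "-" + customContainerName;
    }

    public static void main(String[] args) {
        // Tenant "master" and custom container "myContainer", as in the text above.
        System.out.println(effectiveContainerName("master", "myContainer"));
    }
}
```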
Deploy Hybris: Azure Cloud Service?
- VIP: Windows Azure Load Balancer (Failover, Round Robin, Performance)
- CDN, HTTP/HTTPS
- Cloud Service F.O: nodes N1, N2, … Ni
- Cloud Service B.O: nodes N1, N2, … Ni
- Azure SQL Server
- Azure Blob Storage: medias, files, attachments, orders.pdf…
Deployhybris AzureRunMe
Deploy Hybris: Packaging and deploying Hybris
Windows Azure services are described by two important artifacts:
- Service Definition (*.csdef)
- Service Configuration (*.cscfg)
Your code is zipped and packaged with the definition (*.cspkg):
    Encrypted(Zipped(Code + *.csdef)) == *.cspkg
Windows Azure consumes just (*.cspkg + *.cscfg).
Deploy Hybris: DevOps with Azure PowerShell cmdlets
    # Import the Azure module
    $env:PSModulePath = $env:PSModulePath + ";" + "C:\Program Files (x86)\Microsoft SDKs\Windows Azure\PowerShell"
    Import-Module Azure
    # Connection
    Import-AzurePublishSettingsFile $pubsettings
    Select-AzureSubscription -SubscriptionName $selectedsubscription
    Set-AzureSubscription -CurrentStorageAccount $storageAccountName -SubscriptionName $selectedsubscription
    # Create a new deployment
    $opstat = New-AzureDeployment -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -label $deploymentLabel -ServiceName $serviceName
    # Upgrade an existing deployment
    $setdeployment = Set-AzureDeployment -Upgrade -Slot $slot -Package $packageLocation -Configuration $cloudConfigLocation -label $deploymentLabel -ServiceName $serviceName -Force
    # Swap the staging and production deployments
    Move-AzureDeployment -ServiceName $serviceName
Hybris-CLOUD Demo : AzureRunMe and Windows Azure Emulator AzureRunMe
Elasticsearch Elasticsearch
Elasticsearch
https://github.com/elasticsearch
- Java
- Apache Lucene
- Plug and play
- Document oriented
- Scalable
- Clustering
- Sharding and replication
- REST/JSON client
- Apache 2 license
Elasticsearch: SQL vs ES
Elasticsearch Architecture
Elasticsearch: Mapping field types
Core types: String, Integer, Long, Double, Boolean, Date, Binary…
IP type:
    "address" : { "type" : "ip", "store" : "yes" }
    { "name" : "Tom PC", "address" : "192.168.2.123" }
Geo point type:
    "location" : { "type" : "geo_point" }
Attachment type:
    "my_attachment" : { "type" : "attachment" }
Token count type: the token_count field type stores index information about how many words the given field has, instead of storing and indexing the text provided to the field.
    "address_count" : { "type" : "token_count", "store" : "yes" }
Elasticsearch: Mapping field types
Object types: JSON documents are hierarchical in nature, allowing them to define inner "objects" within the actual JSON.
    "tweet" : {
        "properties" : {
            "person" : {
                "type" : "object",
                "properties" : {
                    "name" : {
                        "type" : "object",
                        "properties" : {
                            "first_name" : { "type" : "string" },
                            "last_name" : { "type" : "string" }
                        }
                    },
                    "sid" : { "type" : "string", "index" : "not_analyzed" }
                }
            },
            "message" : { "type" : "string" }
        }
    }
Elasticsearch: Mapping field types
Nested types: the nested type works like the object type, except that an array of plain objects is flattened, while an array of nested objects allows each object to be queried independently. For example, the following mapping declares users as a nested field:
    {
        "type1" : {
            "properties" : {
                "users" : {
                    "type" : "nested",
                    "properties" : {
                        "first" : { "type" : "string" },
                        "last" : { "type" : "string" }
                    }
                }
            }
        }
    }
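The practical difference shows up when a query combines two fields of the same inner object. The sketch below imitates both behaviours in plain Java, without any Elasticsearch classes: in the flattened (object) case all first names and all last names are pooled per field, so a query for first = Alice and last = Smith matches even though no single user has that name. In the nested case both conditions must hold inside one user. The class name and sample users are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative comparison of object-style (flattened) and nested-style matching. */
final class NestedVsObjectSketch {

    /** Flattened: each condition may be satisfied by a different user in the array. */
    static boolean matchesFlattened(List<String[]> users, String first, String last) {
        boolean anyFirst = false;
        boolean anyLast = false;
        for (String[] user : users) {
            if (user[0].equals(first)) {
                anyFirst = true;
            }
            if (user[1].equals(last)) {
                anyLast = true;
            }
        }
        return anyFirst && anyLast;
    }

    /** Nested: both conditions must be satisfied by the same user. */
    static boolean matchesNested(List<String[]> users, String first, String last) {
        for (String[] user : users) {
            if (user[0].equals(first) && user[1].equals(last)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Each entry is {first, last}.
        List<String[]> users = Arrays.asList(
            new String[] {"John", "Smith"},
            new String[] {"Alice", "White"});

        System.out.println(matchesFlattened(users, "Alice", "Smith")); // cross-object match
        System.out.println(matchesNested(users, "Alice", "Smith"));    // no single user matches
    }
}
```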
Elasticsearch: Mapping field types
Array types: JSON documents allow defining an array (list) of fields or objects.
    "Product" : [
        {
            "id" : 12,
            "title" : "iphone",
            "categories" : [1, 3, 5, 7],
            "tag" : ["iphone4", "iphone5", "iphone6"],
            "author" : [
                { "firstname" : "Francois", "lastname" : "francoisg", "id" : 18 },
                { "firstname" : "Gregory", "lastname" : "gregquat", "id" : "2" }
            ]
        }
    ]
Elasticsearch: Relational vs. denormalized
Elasticsearch: Relational vs. denormalized
    "translation" : {
        "_routing" : { "required" : true, "path" : "project_id" },
        "_id" : { "path" : "id" },
        "_all" : { "enabled" : "false" },
        "dynamic" : "strict",
        "properties" : {
            "id" : { "type" : "string", "index" : "not_analyzed" },
            "public_id" : { "type" : "integer", "index" : "not_analyzed" },
            "project_id" : { "type" : "string", "index" : "not_analyzed" },
            "title_na" : { "type" : "string", "index" : "not_analyzed" },
            "title" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_standard" },
            "title_cs" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_cs" },
            "description" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_standard" },
            "description_cs" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_cs" },
            "resource_file_id" : { "type" : "integer", "index" : "not_analyzed" },
            "created_at" : { "type" : "long", "index" : "not_analyzed" },
            "updated_at" : { "type" : "long", "index" : "not_analyzed" },
            "any_empty" : { "type" : "boolean", "index" : "not_analyzed" },
            "all_empty" : { "type" : "boolean", "index" : "not_analyzed" },
            "status" : { "type" : "string", "index" : "not_analyzed" },
            "phrases" : {
                "_id" : { "path" : "id" },
                "type" : "nested",
                "properties" : {
                    "id" : { "type" : "string", "index" : "not_analyzed" },
                    "iso2_lang" : { "type" : "string", "index" : "not_analyzed" },
                    "content" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_standard" },
                    "content_cs" : { "type" : "string", "index" : "analyzed", "analyzer" : "trans_cs" },
                    "created_at" : { "type" : "long", "index" : "not_analyzed" },
                    "updated_at" : { "type" : "long", "index" : "not_analyzed" },
                    "status" : { "type" : "string", "index" : "not_analyzed" }
                }
            }
        }
    }
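The _routing entry in this mapping makes a routing value mandatory and takes it from project_id. Elasticsearch chooses the target shard from a hash of the routing value modulo the number of primary shards, so all translations of one project land on the same shard. The sketch below illustrates that idea only: it uses String.hashCode() as a stand-in for Elasticsearch's internal hash function (an assumption, the real function differs), and the class name and sample values are hypothetical.

```java
/** Illustrative routing: same routing value, same shard. */
final class RoutingSketch {

    /**
     * Maps a routing value to a shard number in [0, primaryShards).
     * String.hashCode() stands in for the real hash function.
     */
    static int shardFor(String routingValue, int primaryShards) {
        if (primaryShards <= 0) {
            throw new IllegalArgumentException("primaryShards must be positive");
        }
        // floorMod avoids negative results for negative hash codes.
        return Math.floorMod(routingValue.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        int shards = 5; // hypothetical number of primary shards
        System.out.println(shardFor("project-42", shards));
        System.out.println(shardFor("project-42", shards)); // same project, same shard
    }
}
```

This also shows why the number of primary shards cannot simply be changed later: a different modulus would send existing routing values to different shards.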
Elasticsearch: CRUD
Insert data:
    $ cat data.json
    { "index" : { "_index" : "requests", "_type" : "request", "_id" : 33 } }
    { "client" : "client1", "country" : "FR", "id" : 1, "ip" : "100.1.1.3", "password" : "test", "sensor" : "test", "session" : "EFRFR34344", "success" : "OK", "timestamp" : "1414183085848", "username" : "test" }
    $ curl -XPOST http://localhost:9200/requests/_bulk --data-binary @data.json
Update:
    $ curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{"doc":{"name":"new_name"}}'
Delete:
    $ curl -XDELETE 'http://localhost:9200/twitter/tweet/1'
Elasticsearch $ curl -XPOST http://localhost:9200/_search?<YOUR_QUERY> Query DSL
Elasticsearch: Query DSL
    $ curl -XPOST 'http://localhost:9200/requests/_search?pretty' -d '{
        "query" : {
            "filtered" : {
                "query" : {
                    "bool" : {
                        "should" : [
                            { "query_string" : { "query" : "marketing.cars >100" } },
                            { "query_string" : { "query" : "marketing.music > 100" } },
                            { "query_string" : { "query" : "marketing.electronics > 00" } },
                            { "query_string" : { "query" : "marketing.fashion > 100" } }
                        ]
                    }
                },
                "filter" : {
                    "bool" : {
                        "must" : [
                            { "match_all" : {} },
                            { "exists" : { "field" : "location" } }
                        ]
                    }
                }
            }
        },
        "fields" : [ "location", "remoteAddr" ],
        "size" : 1000
    }'
Elasticsearch: Multi Search API
    SearchRequestBuilder requestOne = node.client().prepareSearch()
        .setQuery(QueryBuilders.matchQuery("name", "test1")).setSize(1);
    SearchRequestBuilder requestTwo = node.client().prepareSearch()
        .setQuery(QueryBuilders.matchQuery("name", "test2")).setSize(1);
    MultiSearchResponse sr = node.client().prepareMultiSearch()
        .add(requestOne).add(requestTwo).execute().actionGet();
    // You will get all individual responses from MultiSearchResponse#getResponses()
    long nbHits = 0;
    for (MultiSearchResponse.Item item : sr.getResponses()) {
        SearchResponse response = item.getResponse();
        nbHits += response.getHits().getTotalHits();
    }
Elasticsearch: Bulk API
The bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase the indexing speed.
Example:
    $ cat requests
    {"index":{"_index":"test","_type":"type1","_id":"1"}}
    {"field1":"value1"}
    $ curl -s -XPOST localhost:9200/_bulk --data-binary @requests; echo
    {"took":7,"items":[{"create":{"_index":"test","_type":"type1","_id":"1","_version":1}}]}
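The bulk request body is newline-delimited: each index operation is an action line followed by a source line, and every line, including the last, ends with a newline. A minimal sketch that assembles such a body with plain string concatenation; the class name is hypothetical, and the values are inserted without JSON escaping, so it is only suitable for simple identifiers like the ones in the example above.

```java
/** Illustrative builder for one "index" entry of a bulk request body. */
final class BulkBodySketch {

    /** Returns the action line and the source line, each terminated by '\n'. */
    static String indexAction(String index, String type, String id, String sourceJson) {
        // Note: no JSON escaping is applied to index, type or id.
        String action = "{\"index\":{\"_index\":\"" + index
            + "\",\"_type\":\"" + type
            + "\",\"_id\":\"" + id + "\"}}";
        return action + "\n" + sourceJson + "\n";
    }

    public static void main(String[] args) {
        StringBuilder body = new StringBuilder();
        body.append(indexAction("test", "type1", "1", "{\"field1\":\"value1\"}"));
        body.append(indexAction("test", "type1", "2", "{\"field1\":\"value2\"}"));
        System.out.print(body);
    }
}
```

The resulting text is what would be sent to the _bulk endpoint, for instance with curl and --data-binary so that the newlines are preserved.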
Elasticsearch: Aggregations
The following snippet captures the basic structure of aggregations:
    "aggregations" : {
        "<aggregation_name>" : {
            "<aggregation_type>" : {
                <aggregation_body>
            }
            [,"aggregations" : { [<sub_aggregation>]+ } ]?
        }
        [,"<aggregation_name_2>" : { ... } ]*
    }
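To make the template concrete, the sketch below fills it in for a single terms aggregation: the aggregation name is chosen freely, the aggregation type is terms, and the body names the field to group by. The class name, the aggregation name requests_by_country and the field country are illustrative choices; values are inserted without JSON escaping.

```java
/** Illustrative builder for a request body containing one terms aggregation. */
final class AggregationSketch {

    /** Fills the template: "aggregations" : { "<name>" : { "terms" : { "field" : "<field>" } } } */
    static String termsAggregation(String aggregationName, String field) {
        // Note: no JSON escaping is applied to the name or the field.
        return "{\"aggregations\":{\"" + aggregationName
            + "\":{\"terms\":{\"field\":\"" + field + "\"}}}}";
    }

    public static void main(String[] args) {
        // Hypothetical aggregation name and field.
        System.out.println(termsAggregation("requests_by_country", "country"));
    }
}
```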