In the last post I discussed the various considerations at hand when choosing a Nutanix node for your business use case. This post is a further extension of it: here I would like to describe various use cases for creating containers. First and foremost, we must understand what a container is. The Nutanix documentation explains that containers are a logical extension of a storage pool. All containers are backed by a single pool. If you have a storage pool of 10 TB and you create 2 containers out of it, each container will see 10 TB of storage space; this is what "logical extension" means. By default all containers are thin provisioned, so you don't need to configure thin provisioning at the hypervisor layer. This is the reason I insist you wear your storage administrator thinking hat: normally a storage admin would present either thick or thin LUNs to the hypervisor. I can imagine a question popping up in your mind: how do you do it? I have addressed this in the next post.
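To make the "logical extension" idea concrete, here is a minimal Python sketch of two thin-provisioned containers sharing one 10 TB pool. The class and method names are my own illustration, not a Nutanix API:

```python
class StoragePool:
    """The physical pool; all containers draw from its shared capacity."""
    def __init__(self, capacity_tb):
        self.capacity_tb = capacity_tb
        self.used_tb = 0.0

class Container:
    """Thin provisioned: advertises the full pool size, consumes only what is written."""
    def __init__(self, pool):
        self.pool = pool
        self.written_tb = 0.0

    @property
    def advertised_tb(self):
        # Each container sees the whole pool: the "logical extension".
        return self.pool.capacity_tb

    def write(self, tb):
        # Physical space is consumed from the shared pool only on write.
        if self.pool.used_tb + tb > self.pool.capacity_tb:
            raise RuntimeError("pool out of physical space")
        self.written_tb += tb
        self.pool.used_tb += tb

pool = StoragePool(10)
c1, c2 = Container(pool), Container(pool)
print(c1.advertised_tb, c2.advertised_tb)  # both advertise 10 TB
c1.write(3)
c2.write(4)
print(pool.used_tb)  # 7.0 TB physically consumed, shared between both containers
```

Note that both containers advertise the full 10 TB even though only one pool backs them; overcommit is possible, which is exactly why the storage admin mindset matters.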
Nutanix recommends creating a single storage pool and a single storage container to dynamically optimise the distribution of resources such as capacity and IOPS. However, you will often find a need to create more than one container. What are those needs?
Answering yes to any of the questions below means you need more than one container:
- Do you need RF=3 for some applications?
- Do you need to enable compression feature for some applications?
- Do you need to enable different deduplication policy at storage level?
- Do you need Erasure Coding along with RF for some applications?
Before enabling any of these features, you need to know what the feature is, what its use cases are, and, as a designer, what its impact would be.
Redundancy Factor (RF)
If you need to protect some applications at RF=3, it means two Nutanix nodes can fail simultaneously without the applications being impacted. Storage admin cap on, please: here again we are talking about data protection, not VM protection. In other words, if you have a container at RF=2 and more than one node fails, there is a potential that some extents of a VM become unavailable. RF=3 is a very specific business case; it may not apply everywhere, and there are prerequisites to meet. Refer to the Visio below.
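As a back-of-the-envelope illustration (my own arithmetic, not taken from the Nutanix docs), usable capacity and failure tolerance under a given replication factor work out like this:

```python
def rf_usable_tb(raw_tb, rf):
    """With replication factor rf, every extent is stored rf times,
    so usable capacity is raw capacity divided by rf."""
    return raw_tb / rf

def tolerated_node_failures(rf):
    """rf copies of each extent survive rf - 1 simultaneous node failures."""
    return rf - 1

raw = 40  # e.g. 4 nodes x 10 TB raw each
print(rf_usable_tb(raw, 2), tolerated_node_failures(2))  # 20.0 TB usable, 1 failure
print(rf_usable_tb(raw, 3), tolerated_node_failures(3))  # ~13.3 TB usable, 2 failures
```

This is the trade-off in a nutshell: RF=3 buys one extra simultaneous node failure at the cost of a third copy of every extent.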
Compression
In Nutanix terminology this is referred to as MapReduce compression. Data is compressed, but when? You can compress it either as it is written (inline) or after it is written (post-process). Which is good for us? Storage admin hat? Not really; you need your hypervisor admin hat this time.
Inline compression
If you want your data to be compressed as it is written, you must know the nature of the data being written to the storage. The Nutanix documentation says to use it for sequential workloads, e.g. Hadoop and data analytics. Database log files are sequential by their very nature, so they might strike you as a good candidate. They can be, as long as the data is not already compressed natively: Nutanix recommends not using the compression feature where data is natively compressed.
Post-process compression
Data is compressed after it is written to disk, on the capacity tier. Nutanix recommends post-process compression where data is written once and read frequently, e.g. home folders and archiving & backup solutions.
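A quick way to see why natively compressed data is a poor candidate: compress a repetitive, log-like payload versus an already-compressed payload. I use Python's zlib here purely as a stand-in for the cluster's compression algorithm, not as the actual Nutanix implementation:

```python
import os
import zlib

# Sequential, repetitive data: a good compression candidate.
log_like = b"2024-01-01 INFO txn committed id=42\n" * 1000

# Random bytes run through zlib behave like natively compressed data (JPEG, etc.).
already_compressed = zlib.compress(os.urandom(64 * 1024))

for name, payload in [("log-like", log_like), ("pre-compressed", already_compressed)]:
    ratio = len(zlib.compress(payload)) / len(payload)
    print(f"{name}: compresses to {ratio:.2f}x of original size")
```

The log-like payload shrinks dramatically, while the pre-compressed payload barely shrinks at all, so compressing it again just burns CPU for no capacity gain.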
Deduplication
This is a must for any storage pool, and by all means a popular feature.
Inline deduplication
Data is deduplicated as and when it is written to disk. In Nutanix terms this is referred to as fingerprinting. What data gets the most out of this? Persistent desktops, full clones, and P2V migrations. The biggest advantage of inline deduplication is maximising space in the performance tier, which is made up of SSD + memory. In the storage world that corresponds to write cache, sold at a premium by storage vendors. So you can understand its impact the moment you think from a storage admin's perspective.
On-disk deduplication is mostly focused on capacity savings.
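Here is a minimal sketch of the fingerprinting idea, using SHA-1 over fixed-size chunks (the chunk size and hashing details are my own simplification, not the actual Nutanix implementation). Two full-clone images share almost all of their chunks, so most fingerprints are duplicates and only unique chunks need to be stored:

```python
import hashlib
import os

CHUNK = 4096  # fingerprint granularity for this sketch

def fingerprints(data):
    """One hash per fixed-size chunk; equal chunks yield equal fingerprints."""
    return [hashlib.sha1(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

base_image = os.urandom(10 * CHUNK)                 # a 40 KB "golden image"
clone_a = base_image                                 # full clone, unchanged
clone_b = base_image[:-CHUNK] + b"\x00" * CHUNK      # clone with one changed chunk

all_fps = fingerprints(clone_a) + fingerprints(clone_b)
unique = set(all_fps)
print(len(all_fps), "chunks written,", len(unique), "stored after dedup")
# 20 chunks written, 11 stored after dedup
```

That is why full clones and persistent desktops are the textbook candidates: nearly identical images collapse into one set of stored chunks.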
Erasure Coding (EC)
EC gives you capacity savings over and above compression and deduplication. It is strongly recommended when space optimisation is your goal or when you are scared of losing 50% of your capacity to replication. That fear is somewhat mythical, though; refer to Josh's post on it. Good use cases are again file servers, backup, and archival. You can see that compression and EC go hand in hand.
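The capacity saving is easiest to see as strip-overhead arithmetic. The strip sizes below are illustrative; the actual strip size depends on cluster size:

```python
def overhead(data_blocks, parity_blocks):
    """Physical-to-logical ratio for a strip of data plus parity blocks."""
    return (data_blocks + parity_blocks) / data_blocks

print(overhead(1, 1))  # plain RF=2: 2.0x, i.e. the "50% loss"
print(overhead(4, 1))  # EC strip of 4 data + 1 parity (RF=2 equivalent): 1.25x
print(overhead(4, 2))  # EC strip of 4 data + 2 parity (RF=3 equivalent): 1.5x
```

Instead of a full second copy of everything, EC stores parity for a strip of blocks, which is why the overhead drops well below the 2x of plain replication.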
Where not to use these features
- Do not use compression where applications natively compress data, e.g. JPEG files, databases with built-in compression, heavy random writes, and frequent overwrites. Similarly, do not use the EC feature where there are frequent overwrites. The rate of return on space savings diminishes as cluster size increases beyond 6 nodes. EC has some prerequisites, which I've explained in the mind map in my next post under prerequisites and recommendations.
- Deduplication is strongly discouraged when using linked clones.
I have tried to articulate my 855 words in the Visio below. The prerequisites are referred to as WHAT: what you need in order to enable a feature. Worth noting is the license type you need; for example, for on-disk deduplication, EC, RF=3 and post-process compression you need the Pro license. WHY denotes why you need the feature. I have discussed Reserved Capacity and Advertised Capacity in the next post.
Hope you find it useful.
Related Posts: http://www.vzare.com/?p=4451