In my previous post we looked at the general storage architecture of HBase. This post explains in detail how the log works, but bear in mind that it describes the current version; I will address the various plans to improve the log for future versions.
Another option to get more data into memory is to reduce the block size of the data stored on disk. When a client requests a row, the block of the store file that contains the row is read from disk into the block cache before the requested data is sent back to the client.
So by decreasing the block size, more relevant data can be kept in the cache, which can improve read performance. However, reducing the block size is not beneficial in all scenarios.
But here is an example where reducing the block size will help. Suppose the use case is reading the latest data about stock tickers, each stored in a row of about 5 KB. Since the probability that users will read the same popular tickers repeatedly is high, the relevant data to keep in memory is the popular ticker data.
In this scenario, reducing the block size to 8 KB will considerably improve the amount of relevant data stored in the cache, and will improve query performance.
One drawback of reducing the block size is the increase in index and metadata stored in the store files and the cache, which may be a small price to pay for the performance gain.
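To make the tradeoff concrete, here is a small back-of-the-envelope calculation. The 5 KB row size and the 8 KB block size come from the example above; the 64 KB default block size is HBase's documented default, and the 1 GB cache size is an assumed figure for illustration. The model is deliberately simplified: it ignores block headers and the larger index that smaller blocks produce.

```python
def blocks_for_cache(cache_bytes: int, block_bytes: int) -> int:
    """How many whole blocks of the given size fit in the cache."""
    return cache_bytes // block_bytes

ROW_BYTES = 5 * 1024       # hot ticker row from the example above
CACHE_BYTES = 1 * 1024**3  # assume a 1 GB block cache for illustration

for block_bytes in (64 * 1024, 8 * 1024):  # default vs. reduced block size
    n_blocks = blocks_for_cache(CACHE_BYTES, block_bytes)
    # Only one hot row per block is guaranteed relevant; the rest of the
    # block may be neighbouring rows the client never asked for.
    relevant = min(ROW_BYTES, block_bytes) * n_blocks
    print(f"{block_bytes // 1024:>2} KB blocks: {n_blocks} cached blocks, "
          f"{100 * relevant / CACHE_BYTES:.1f}% of cache holds requested rows")
```

With these assumed numbers, 8 KB blocks carry 5 KB of requested data out of every 8 KB cached instead of 5 KB out of every 64 KB, i.e. eight times more relevant data per byte of cache; the flip side, as noted above, is roughly eight times as many blocks, and therefore eight times as many block-index entries.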
Given that physical servers currently come with large amounts of memory, can we allocate an HBase heap size of, say, 96 GB so that we can store a large amount of data in the cache?
The answer is no. We will look at the reason, and at an option to leverage the available physical memory, next.
Posted by Biju Nair.
Watch out for swapping: set vm.swappiness to 0 (for example, `sysctl -w vm.swappiness=0`, persisted in /etc/sysctl.conf).
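As a quick sanity check, a few lines of Python can report the current setting on a Linux host. The procfs path below is the standard kernel location; the script is only a sketch that reads the value and warns when it is non-zero.

```python
from pathlib import Path

def read_swappiness(path: str = "/proc/sys/vm/swappiness"):
    """Return the kernel's current swappiness as an int,
    or None if the file is unavailable (e.g. on a non-Linux host)."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())

value = read_swappiness()
if value is not None and value > 0:
    print(f"vm.swappiness is {value}; consider lowering it to 0 on HBase nodes")
```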
Poor write performance by an HBase client (a Q&A excerpt): in my application, I place the HBase write call in a queue (asynchronously) and drain the queue using 20 consumer threads. On hitting the web server locally using curl, I am able to see the TPS for HBase after curl completes, but with a load test, where request…

Best practices to optimize Phoenix performance: the most important aspect of Phoenix performance is to optimize the underlying HBase.
Phoenix creates a relational data model atop HBase, converting SQL queries into HBase operations such as scans. For write-heavy workloads where some risk of losing recent writes on failure is acceptable, consider disabling the write-ahead log. The WAL resides in HDFS in the /hbase/WALs/ directory (in earlier HBase versions it was stored in /hbase/.logs/), with subdirectories per region server.
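As a sketch, Phoenix lets you turn the WAL off per table with the `DISABLE_WAL` table property; the table and column names below are purely illustrative. Remember that with the WAL disabled, writes that have not yet been flushed to store files are lost if a region server fails.

```sql
-- Illustrative table; DISABLE_WAL=true skips the write-ahead log for writes
-- to this table, trading durability for write throughput.
CREATE TABLE ticker_quotes (
    symbol     VARCHAR NOT NULL,
    quote_time TIMESTAMP NOT NULL,
    price      DECIMAL,
    CONSTRAINT pk PRIMARY KEY (symbol, quote_time)
) DISABLE_WAL=true;
```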
For more general information about the concept of write-ahead logs, see the Wikipedia Write-Ahead Log article.