Today I'd like to talk about the limits Docker places on the JVM. Many people aren't too familiar with them, so I've put together the following summary and hope you get something useful out of it.

Let's start with a limitation people run into all the time: when we run tools such as jmap against a Java application inside Docker, they often fail with:
Can't attach to the process: ptrace(PTRACE_ATTACH, ..).
This happens because tools like jstack and jmap work through one of two mechanisms:

The Attach mechanism (i.e. VirtualMachine.attach()), which talks to the target JVM's Attach Listener thread over a socket (a small sketch follows below).

The Serviceability Agent (SA), which is also a form of attaching, but on Linux relies on the ptrace system call.
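To make the first mechanism concrete, here is a minimal sketch using the JDK Attach API from com.sun.tools.attach (on JDK 8 it lives in tools.jar, which must be on the classpath); the PID 12345 is a placeholder for a real target JVM:

// Minimal sketch of the Attach mechanism: connect to a running JVM by PID
// and ask it for its system properties, then detach.
import com.sun.tools.attach.VirtualMachine;

public class AttachDemo {
    public static void main(String[] args) throws Exception {
        VirtualMachine vm = VirtualMachine.attach("12345"); // placeholder PID
        try {
            // Talks to the target JVM's Attach Listener thread over a socket.
            System.out.println(vm.getSystemProperties());
        } finally {
            vm.detach();
        }
    }
}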
Since Docker 1.10, the default seccomp profile disables ptrace, so operations that go through the SA, such as jmap -heap, fail. Docker itself documents two workarounds:

Explicitly grant the capability with --cap-add=SYS_PTRACE: docker run --cap-add=SYS_PTRACE ...

Disable seccomp, or add ptrace to the allowed list in a custom profile: docker run --security-opt seccomp:unconfined ...
Beyond that limitation, a while ago I stumbled across the following bug while browsing the JDK Bug System: JDK-8140793
getAvailableProcessors may incorrectly report the number of cpus in Docker container
The bug roughly describes the phenomenon that when Java runs inside a Docker container, the CPU count it obtains may be wrong.

As we all know, Docker is built on cgroups and namespaces. Cgroups are a Linux kernel feature that limits and isolates the resource usage of processes (CPU, memory, disk I/O, network, and so on), so my guess was that the JVM simply was not reading the limits Docker imposes through cgroups at run time.
Looking further, the bug's status is RESOLVED, and digging around led me to this post on the official blog: "Java SE support for Docker CPU and memory limits" (the post links JDK-8140793, which reports the wrong CPU count in Docker, JDK-8170888, an enhancement for memory limits in Docker, and JDK-8146115, an enhancement for container detection and resource configuration usage).
The post notes that in Java SE 8u121 and earlier, the CPU count and memory size the JVM reads are the host's values, unconstrained by cgroups. Why does that matter? As far as I know, when we don't specify certain parameters explicitly, the JVM uses the values it reads to derive its defaults. For example, if -XX:ParallelGCThreads and -XX:CICompilerCount are not set explicitly, the JVM computes them from the CPU count it sees. Here is where the number of parallel GC worker threads is calculated, in runtime/vm_version.cpp (the following is based on OpenJDK 8 b120):
  if (FLAG_IS_DEFAULT(ParallelGCThreads)) {
    assert(ParallelGCThreads == 0, "Default ParallelGCThreads is not 0");
    // For very large machines, there are diminishing returns
    // for large numbers of worker threads.  Instead of
    // hogging the whole system, use a fraction of the workers for every
    // processor after the first 8.  For example, on a 72 cpu machine
    // and a chosen fraction of 5/8
    // use 8 + (72 - 8) * (5/8) == 48 worker threads.
    unsigned int ncpus = (unsigned int) os::active_processor_count();
    return (ncpus <= switch_pt) ?
           ncpus :
           (switch_pt + ((ncpus - switch_pt) * num) / den);
  } else {
    return ParallelGCThreads;
  }

Stepping into os::active_processor_count(), which fetches the CPU count (Linux implementation in os_linux.cpp):
int os::active_processor_count() {
  // Linux doesn't yet have a (official) notion of processor sets,
  // so just return the number of online processors.
  int online_cpus = ::sysconf(_SC_NPROCESSORS_ONLN);
  assert(online_cpus > 0 && online_cpus <= processor_count(), "sanity check");
  return online_cpus;
}

Sure enough, the CPU count is read from the physical machine via ::sysconf(_SC_NPROCESSORS_ONLN). The GC worker thread calculation is therefore skewed, and the number of JIT compiler threads suffers from the same problem.
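A quick way to observe this from the Java side is Runtime.availableProcessors(), which is backed by os::active_processor_count(); that is exactly what JDK-8140793 is about. A minimal check program (what it prints obviously depends on your JDK build and environment):

// Prints what the JVM believes the processor count is. Inside a
// CPU-limited container on a pre-container-aware JDK 8, this usually
// matches the host rather than the cgroup limit.
public class CpuCheck {
    public static void main(String[] args) {
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());
    }
}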
It is not only the CPU reading that goes wrong; memory behaves the same way. When we don't explicitly specify parameters such as -Xmx (MaxHeapSize) or -Xms (InitialHeapSize), the JVM derives defaults from the machine's memory size it reads, for example:
void Arguments::set_heap_size() {
  if (!FLAG_IS_DEFAULT(DefaultMaxRAMFraction)) {
    // Deprecated flag
    FLAG_SET_CMDLINE(uintx, MaxRAMFraction, DefaultMaxRAMFraction);
  }

  const julong phys_mem =
    FLAG_IS_DEFAULT(MaxRAM) ? MIN2(os::physical_memory(), (julong)MaxRAM)
                            : (julong)MaxRAM;

  // If the maximum heap size has not been set with -Xmx,
  // then set it as fraction of the size of physical memory,
  // respecting the maximum and minimum sizes of the heap.
  if (FLAG_IS_DEFAULT(MaxHeapSize)) {
    julong reasonable_max = phys_mem / MaxRAMFraction;

    if (phys_mem <= MaxHeapSize * MinRAMFraction) {
      // Small physical memory, so use a minimum fraction of it for the heap
      reasonable_max = phys_mem / MinRAMFraction;
    }
    ...
  }
}

Here os::physical_memory() again reports the physical machine's memory, which in Docker can trigger a whole series of problems, such as the process being killed by the OOM killer.
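The memory side can be checked the same way: without -Xmx, the default maximum heap is derived from the physical memory the JVM sees, so Runtime.maxMemory() on an older JDK 8 inside a memory-limited container typically reflects the host's RAM rather than the container limit. A minimal check:

// Prints the maximum heap the JVM has settled on. With no -Xmx and a
// pre-container-aware JDK 8, this is derived from the host's physical
// memory, not from the container's cgroup memory limit.
public class HeapCheck {
    public static void main(String[] args) {
        System.out.println("maxMemory = "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024) + " MB");
    }
}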
So with older JDK 8 builds, all kinds of strange problems can appear if we don't specify these parameters explicitly. From JDK-8146115 I learned that this Docker-support enhancement was implemented in JDK 10, where -XX:+UseContainerSupport turns container support on, and that it has also been backported to newer JDK 8 update releases (builds after JDK 8u131 already carry parts of it).

I downloaded a recent OpenJDK 8 build, and going through the source shows that Oracle did indeed handle this.

The original os::active_processor_count() has become:
// Determine the active processor count from one of
// three different sources:
//
// 1. User option -XX:ActiveProcessorCount
// 2. kernel os calls (sched_getaffinity or sysconf(_SC_NPROCESSORS_ONLN)
// 3. extracted from cgroup cpu subsystem (shares and quotas)
//
// Option 1, if specified, will always override.
// If the cgroup subsystem is active and configured, we
// will return the min of the cgroup and option 2 results.
// This is required since tools, such as numactl, that
// alter cpu affinity do not update cgroup subsystem
// cpuset configuration files.
int os::active_processor_count() {
  // User has overridden the number of active processors
  if (ActiveProcessorCount > 0) {
    if (PrintActiveCpus) {
      tty->print_cr("active_processor_count: "
                    "active processor count set by user : %d",
                    ActiveProcessorCount);
    }
    return ActiveProcessorCount;
  }

  int active_cpus;
  if (OSContainer::is_containerized()) {
    active_cpus = OSContainer::active_processor_count();
    if (PrintActiveCpus) {
      tty->print_cr("active_processor_count: determined by OSContainer: %d",
                    active_cpus);
    }
  } else {
    active_cpus = os::Linux::active_processor_count();
  }

  return active_cpus;
}

We can clearly see that if -XX:ActiveProcessorCount is given, that value is used; otherwise OSContainer::is_containerized() decides whether the JVM is running in a container:
inline bool OSContainer::is_containerized() {
  assert(_is_initialized, "OSContainer not initialized");
  return _is_containerized;
}

_is_containerized is set when Threads::create_vm calls OSContainer::init(), which checks whether the VM is running inside a container (the full method is too long to reproduce here):
/* init
 *
 * Initialize the container support and determine if
 * we are running under cgroup control.
 */
void OSContainer::init() {
  int mountid;
  int parentid;
  int major;
  int minor;
  FILE *mntinfo = NULL;
  FILE *cgroup = NULL;
  char buf[MAXPATHLEN+1];
  char tmproot[MAXPATHLEN+1];
  char tmpmount[MAXPATHLEN+1];
  char tmpbase[MAXPATHLEN+1];
  char *p;
  jlong mem_limit;

  assert(!_is_initialized, "Initializing OSContainer more than once");

  _is_initialized = true;
  _is_containerized = false;

  _unlimited_memory = (LONG_MAX / os::vm_page_size()) * os::vm_page_size();

  if (PrintContainerInfo) {
    tty->print_cr("OSContainer::init: Initializing Container Support");
  }
  if (!UseContainerSupport) {
    if (PrintContainerInfo) {
      tty->print_cr("Container Support not enabled");
    }
    return;
  }

  ...........

  _is_containerized = true;
}

The method essentially performs a series of checks: whether the UseContainerSupport flag is enabled, whether /proc/self/mountinfo and /proc/self/cgroup are readable, and so on. If it decides the JVM is running in a container, OSContainer::active_processor_count() is then called to obtain the CPU count the container allows:
/* active_processor_count
*
* Calculate an appropriate number of active processors for the
* VM to use based on these three inputs.
*
* cpu affinity
* cgroup cpu quota & cpu period
* cgroup cpu shares
*
* Algorithm:
*
* Determine the number of available CPUs from sched_getaffinity
*
* If user specified a quota (quota != -1), calculate the number of
* required CPUs by dividing quota by period.
*
* If shares are in effect (shares != -1), calculate the number
* of CPUs required for the shares by dividing the share value
* by PER_CPU_SHARES.
*
* All results of division are rounded up to the next whole number.
*
* If neither shares or quotas have been specified, return the
* number of active processors in the system.
*
* If both shares and quotas have been specified, the results are
* based on the flag PreferContainerQuotaForCPUCount. If true,
* return the quota value. If false return the smallest value
* between shares or quotas.
*
* If shares and/or quotas have been specified, the resulting number
* returned will never exceed the number of active processors.
*
* return:
* number of CPUs
 */
int OSContainer::active_processor_count() {
  int quota_count = 0, share_count = 0;
  int cpu_count, limit_count;
  int result;

  cpu_count = limit_count = os::Linux::active_processor_count();
  int quota  = cpu_quota();
  int period = cpu_period();
  int share  = cpu_shares();

  ...........
}

As the comment explains, the calculation is now based on the cgroup cpu quota & cpu period and the cgroup cpu shares, which Docker configures through options such as --cpu-period, --cpu-quota and --cpu-shares.
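To make the algorithm in that comment easier to follow, here is a rough transcription of it in Java. This is purely illustrative and not the HotSpot implementation: PER_CPU_SHARES is 1024 in the real code, a value of -1 means "not configured", and PreferContainerQuotaForCPUCount is modeled as a plain boolean parameter.

// Illustrative re-statement of the documented algorithm, not the HotSpot code.
public class ContainerCpuCount {
    static final int PER_CPU_SHARES = 1024; // value used by the real implementation

    static int count(int hostCpus, long quota, long period,
                     long shares, boolean preferQuota) {
        int quotaCount = 0;
        int shareCount = 0;
        if (quota > -1 && period > 0) {
            // e.g. docker run --cpus=1.5 -> quota=150000, period=100000 -> 2
            quotaCount = (int) Math.ceil((double) quota / period);
        }
        if (shares > -1) {
            // e.g. docker run --cpu-shares=2048 -> 2
            shareCount = (int) Math.ceil((double) shares / PER_CPU_SHARES);
        }

        int limit;
        if (quotaCount != 0 && shareCount != 0) {
            limit = preferQuota ? quotaCount : Math.min(quotaCount, shareCount);
        } else if (quotaCount != 0) {
            limit = quotaCount;
        } else if (shareCount != 0) {
            limit = shareCount;
        } else {
            return hostCpus;              // neither quota nor shares configured
        }
        return Math.min(limit, hostCpus); // never exceed the active processors
    }

    public static void main(String[] args) {
        // 8-CPU host, --cpus=1.5 (quota 150000, period 100000), no shares:
        System.out.println(count(8, 150000, 100000, -1, true)); // prints 2
    }
}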
Memory is handled similarly: if -Xmx is not specified, the JVM can be started with the two flags -XX:+UnlockExperimentalVMOptions and -XX:+UseCGroupMemoryLimitForHeap so that it uses the Linux cgroup configuration to determine the maximum Java heap size.

The Arguments::set_heap_size() method now looks like this:
void Arguments::set_heap_size() {
  if (!FLAG_IS_DEFAULT(DefaultMaxRAMFraction)) {
    // Deprecated flag
    FLAG_SET_CMDLINE(uintx, MaxRAMFraction, DefaultMaxRAMFraction);
  }

  julong phys_mem =
    FLAG_IS_DEFAULT(MaxRAM) ? MIN2(os::physical_memory(), (julong)MaxRAM)
                            : (julong)MaxRAM;

  // Experimental support for CGroup memory limits
  if (UseCGroupMemoryLimitForHeap) {
    // This is a rough indicator that a CGroup limit may be in force
    // for this process
    const char* lim_file = "/sys/fs/cgroup/memory/memory.limit_in_bytes";
    FILE *fp = fopen(lim_file, "r");
    if (fp != NULL) {
      julong cgroup_max = 0;
      int ret = fscanf(fp, JULONG_FORMAT, &cgroup_max);
      if (ret == 1 && cgroup_max > 0) {
        // If unlimited, cgroup_max will be a very large, but unspecified
        // value, so use initial phys_mem as a limit
        if (PrintGCDetails && Verbose) {
          // Cannot use gclog_or_tty yet.
          tty->print_cr("Setting phys_mem to the min of cgroup limit ("
                        JULONG_FORMAT "MB) and initial phys_mem ("
                        JULONG_FORMAT "MB)", cgroup_max/M, phys_mem/M);
        }
        phys_mem = MIN2(cgroup_max, phys_mem);
      } else {
        warning("Unable to read/parse cgroup memory limit from %s: %s",
                lim_file, errno != 0 ? strerror(errno) : "unknown error");
      }
      fclose(fp);
    } else {
      warning("Unable to open cgroup memory limit file %s (%s)", lim_file, strerror(errno));
    }
  }

  ....................
}

The JVM now caps the physical-memory value it starts from with the memory limit read from the cgroup file system. Note the "Experimental support" wording in the comment, though: this is still only experimental support and presumably not fully mature.
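If you want to check from the application side what that file actually contains (the cgroup v1 path the code above opens), a small sketch like the following will do; it assumes the cgroup v1 layout and simply throws where the file does not exist:

// Reads the same cgroup v1 memory limit file the HotSpot code above opens.
// On an unlimited cgroup this prints a very large number; in that case the
// JVM falls back to the physical memory value, as the source comment notes.
import java.nio.file.Files;
import java.nio.file.Paths;

public class CgroupMemLimit {
    public static void main(String[] args) throws Exception {
        String limit = new String(Files.readAllBytes(
                Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes"))).trim();
        System.out.println("memory.limit_in_bytes = " + limit);
    }
}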
As you can see, running Java in Docker comes with quite a few small pitfalls and limitations, and a badly chosen setup can lead to puzzling problems. The safest approach is to set the JVM parameters explicitly to match the container's configuration, for example -Xmx/-Xms, -XX:ParallelGCThreads and -XX:CICompilerCount, which avoids most of these issues. If problems remain, consider upgrading to a newer JDK 8 update release; if upgrading is too costly, look into approaches that load an external library to intercept and correct the values the JVM reads.

That wraps up this overview of the limits Docker places on the JVM; hopefully it gives you a clearer picture.