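1. Overview

The goal is to understand how a database record is used over its lifetime, from creation to the moment it is queried.
The query-time distribution is computed with the Top Percentile (TP) method: sort the samples, then report the largest value among the first xx% of them.
For example, a TP999 of 1ms means that 99.9% of an interface's responses complete within 1ms.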
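
The definition above can be sketched as a nearest-rank lookup on a sorted list. This is a minimal illustration, not the code from the full source: the class name `TopPercentile`, the method `tp`, and the choice to express the level in per-mille (999 for TP999, so the rank is computed with integer arithmetic) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TopPercentile {
    /**
     * Nearest-rank percentile on an ascending list.
     * perMille is the level in thousandths: 500 = TP50, 900 = TP90, 990 = TP99, 999 = TP999.
     */
    static long tp(List<Long> sortedAsc, int perMille) {
        if (sortedAsc.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        // 1-based rank = ceil(n * perMille / 1000), computed without floating point.
        long rank = ((long) sortedAsc.size() * perMille + 999) / 1000;
        rank = Math.max(1, Math.min(rank, sortedAsc.size()));
        return sortedAsc.get((int) rank - 1);
    }

    public static void main(String[] args) {
        // Hypothetical sample: durations 1..1000 ms.
        List<Long> durations = new ArrayList<>();
        for (long ms = 1; ms <= 1000; ms++) {
            durations.add(ms);
        }
        Collections.sort(durations);
        System.out.println("tp50=" + tp(durations, 500));
        System.out.println("tp999=" + tp(durations, 999));
    }
}
```

With the sample 1..1000, `tp(durations, 999)` returns 999, i.e. 99.9% of the samples are at or below that value.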

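The end goal is to find out how long after creation a record can be relegated to cold, inexpensive storage.
Tiering data into hot and cold helps strike a balance between performance and cost.
For example, if the in-memory cache expiry is set to the TP90 time, 90% of the data can be returned quickly, and the small remainder is fetched from the origin store.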
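
A rough sketch of that cache-sizing argument follows. It assumes that each sample is the age of a record (milliseconds since creation) at the time it was queried, and that a record is cached from the moment it is created; the class `CacheTtlSketch`, both method names, and the sample ages are hypothetical.

```java
import java.util.Arrays;

class CacheTtlSketch {
    /** Nearest-rank TP90 of the observed record ages, used as the cache TTL. */
    static long ttlAtTp90(long[] agesMs) {
        if (agesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = agesMs.clone();
        Arrays.sort(sorted);
        int rank = (sorted.length * 9 + 9) / 10; // ceil(0.9 * n)
        return sorted[rank - 1];
    }

    /** Fraction of queries that arrive while the record is still within the TTL. */
    static double hitRatio(long[] agesMs, long ttlMs) {
        long hits = Arrays.stream(agesMs).filter(age -> age <= ttlMs).count();
        return (double) hits / agesMs.length;
    }

    public static void main(String[] args) {
        // Made-up ages, for illustration only.
        long[] ages = {20, 35, 50, 80, 120, 200, 450, 900, 1500, 60000};
        long ttl = ttlAtTp90(ages);
        System.out.println("ttl(ms)=" + ttl + " hitRatio=" + hitRatio(ages, ttl));
    }
}
```

In this made-up sample the single very old query (60000 ms) is the one that falls through to the origin store, while the TTL stays at 1500 ms instead of being stretched to cover it.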

Keywords
Java formatted output
Java 8
stream
parallelStream
grouping
sorting
data analysis with DoubleSummaryStatistics
TP999

2. Results

Preview of the source log data

19-10-24.14:47:12.721 [THREAD-22000-18-T-17] INFO FacadeImpl - response yw:jiaoyi, orderId:123456, time:2019-10-24T14:46:44
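
To make the log line above concrete, here is one way such a line could be turned into a (caller, duration) pair. This is a sketch under assumptions: it takes the leading timestamp as the query time and the trailing `time:` field as the record's creation time, and treats the difference in milliseconds as the duration. The class `LogLineParser` and its nested `Parsed` type are hypothetical and are not the `getCreateTimeVo` method from the full source.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineParser {
    // Group 1: log timestamp, group 2: yw (caller), group 3: record creation time.
    private static final Pattern LINE = Pattern.compile(
            "(\\d{2}-\\d{2}-\\d{2}\\.\\d{2}:\\d{2}:\\d{2}\\.\\d{3}).*?yw:([^,\\s]+).*?time:(\\S+)");
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yy-MM-dd.HH:mm:ss.SSS");

    static final class Parsed {
        final String caller;
        final long durationMs;

        Parsed(String caller, long durationMs) {
            this.caller = caller;
            this.durationMs = durationMs;
        }
    }

    /** Returns empty for lines that do not match; malformed timestamps still throw. */
    static Optional<Parsed> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        LocalDateTime queriedAt = LocalDateTime.parse(m.group(1), LOG_TIME);
        LocalDateTime createdAt = LocalDateTime.parse(m.group(3)); // ISO-8601 local date-time
        long ms = Duration.between(createdAt, queriedAt).toMillis();
        return Optional.of(new Parsed(m.group(2), ms));
    }

    public static void main(String[] args) {
        String line = "19-10-24.14:47:12.721 [THREAD-22000-18-T-17] INFO FacadeImpl - "
                + "response yw:jiaoyi, orderId:123456, time:2019-10-24T14:46:44";
        parse(line).ifPresent(p -> System.out.println(p.caller + " " + p.durationMs + "ms"));
    }
}
```

Returning `Optional.empty()` for non-matching lines keeps unrelated log lines from producing null entries in the later grouping step.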

yw	count	min(ms)	max(ms)	tp50(ms)	tp90(ms)	tp99(ms)	tp999(ms)
hisen	1000	1	10000	20	60	90	130
hisen-1	200	2	16000	16	50	70	110

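PS: the output is formatted text, not a table. It can be converted as follows:

Word -> paste the output text -> Insert -> Table -> Convert Text to Table -> separate at spaces

This completes the text-to-table conversion.

The procedure may look low-tech, but it still takes time, and along the way the text has to be consolidated with Linux commands and similar tools.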

3. Code

Full source: github-CallerAnalyze.java
Excerpt:

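// Guava (com.google.common) supplies Lists, Files and LineProcessor.
// CallerTimeVo is not shown in this excerpt; it is assumed to be in the same package.
import com.google.common.collect.Lists;
import com.google.common.io.Files;
import com.google.common.io.LineProcessor;

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/**
 * @author hisenyuan
 * @date 2019-10-25 23:20
 */
public class CallerAnalyze {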
    public static void main(String[] args) throws IOException {
        String filePath = "/Users/hisenyuan/yw/yw.log";
        ArrayList<CallerTimeVo> callerTimeVos = getCallerTimeVos(filePath);

        System.out.println("size:" + callerTimeVos.size());
        System.out.printf("%-20s%-20s%-20s%-20s%-20s%-20s%-20s%-20s", "yw", "count", "min(ms)", "max(ms)", "tp50(ms)", "tp90(ms)", "tp99(ms)", "tp999(ms)");
        System.out.println();


        // Group all records by caller (the yw field)
        Map<String, List<CallerTimeVo>> callerMap = callerTimeVos.parallelStream().collect(Collectors.groupingBy(CallerTimeVo::getCaller));

        callerMap.entrySet()
                .stream()
                // Order callers by record count, ascending
                .sorted(Comparator.comparingInt(value -> value.getValue().size()))
                .forEach(stringListEntry -> {
                    String caller = stringListEntry.getKey();
                    System.out.printf("%-20s", caller);
                    // Sort this caller's durations so percentiles can be read by position
                    List<Long> sorted = stringListEntry.getValue()
                            .parallelStream()
                            .map(CallerTimeVo::getDuration)
                            .sorted(Long::compareTo)
                            .collect(Collectors.toList());
                    // calTime is not shown in this excerpt
                    calTime(sorted);
                });
    }
    private static ArrayList<CallerTimeVo> getCallerTimeVos(String filePath) throws IOException {
        ArrayList<CallerTimeVo> callerTimeVos = Lists.newArrayList();
        Files.asCharSource(new File(filePath), Charset.forName("UTF-8")).readLines(new LineProcessor<String>() {
            @Override
            public boolean processLine(String line) {
                // Parse each line (getCreateTimeVo is not shown in this excerpt)
                CallerTimeVo vo = getCreateTimeVo(line);
                callerTimeVos.add(vo);
                // Returning false would stop reading further lines
                return true;
            }

            @Override
            public String getResult() {
                return null;
            }
        });
        return callerTimeVos;
    }
}
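
The excerpt calls `calTime(sorted)` without showing it. The following is a guess at what such a method could look like, not the author's implementation: it assumes the method prints the seven columns that follow the caller name (count, min, max, TP50, TP90, TP99, TP999), takes count/min/max from `DoubleSummaryStatistics`, and reads the percentiles by nearest rank. The class `CalTimeSketch` and the helpers `tp` and `formatRow` are hypothetical names.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Locale;

class CalTimeSketch {
    /** Nearest-rank value for a per-mille level (999 means TP999); the list must be non-empty. */
    static long tp(List<Long> sortedAsc, int perMille) {
        long rank = ((long) sortedAsc.size() * perMille + 999) / 1000;
        return sortedAsc.get((int) Math.max(rank, 1) - 1);
    }

    /** Builds the columns after the caller name, each left-aligned in 20 characters. */
    static String formatRow(List<Long> sortedAsc) {
        DoubleSummaryStatistics stats = sortedAsc.stream()
                .mapToDouble(Long::doubleValue)
                .summaryStatistics();
        return String.format(Locale.ROOT, "%-20d%-20.0f%-20.0f%-20d%-20d%-20d%-20d",
                stats.getCount(), stats.getMin(), stats.getMax(),
                tp(sortedAsc, 500), tp(sortedAsc, 900), tp(sortedAsc, 990), tp(sortedAsc, 999));
    }

    /** Prints one row and ends the line, matching the printf-based header. */
    static void calTime(List<Long> sortedAsc) {
        System.out.println(formatRow(sortedAsc));
    }

    public static void main(String[] args) {
        // Made-up, already sorted durations in milliseconds.
        calTime(Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L));
    }
}
```

Because the caller name is printed with `%-20s` before `calTime` runs, keeping every column at the same 20-character width lines the rows up under the header.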
