Project author: mtumilowicz

Project description:
Overview of java String concatenation compilation: java 8 vs java 9.
Language: Java
Project URL: git://github.com/mtumilowicz/java9-string-concat.git
Created: 2018-11-14T19:15:17Z
Project community: https://github.com/mtumilowicz/java9-string-concat


java9-string-concat

Overview of java String concatenation compilation: java 8 vs java 9.

Reference: http://www.pellegrino.link/2015/08/22/string-concatenation-with-java-8.html
Reference: https://www.guardsquare.com/en/blog/string-concatenation-java-9-untangling-invokedynamic
Reference: https://www.guardsquare.com/en/blog/string-concatenation-java-9-conversion-confusion
Reference: https://arnaudroger.github.io/blog/2017/06/14/CompactStrings.html
Reference: https://stackoverflow.com/questions/46512888/how-is-string-concatenation-implemented-in-java-9

preface

String concatenation is one of the best-known pitfalls in Java.

java 8

  • all of the substrings building the final String are known at compile
    time:

    @Test
    public void nonLoopConcatenation() {
        String a = "a";
        String b = "b";
        System.out.println(a + b);
    }

    is compiled to:

    public nonLoopConcatenation()V
    @Lorg/junit/Test;()
    L0
    LINENUMBER 10 L0
    LDC "a"
    ASTORE 1
    L1
    LINENUMBER 11 L1
    LDC "b"
    ASTORE 2
    L2
    LINENUMBER 12 L2
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ALOAD 1
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ALOAD 2
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    L3
    LINENUMBER 13 L3
    RETURN
    L4
    LOCALVARIABLE this LStringConcatBenchmarkTest; L0 L4 0
    LOCALVARIABLE a Ljava/lang/String; L1 L4 1
    LOCALVARIABLE b Ljava/lang/String; L2 L4 2
    MAXSTACK = 3
    MAXLOCALS = 3

    the most important part is:

    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ALOAD 1
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ALOAD 2
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V

    which is also the bytecode of:

    String a = "a";
    String b = "b";
    System.out.println(new StringBuilder().append(a).append(b).toString());

    This is called static string concatenation optimisation.

  • substrings building the final String are NOT known at compile
    time

    @Test
    public void loopConcatenation() {
        long start = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < 50_000; i++) {
            result += i;
        }
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);
    }

    is compiled to:

    public loopConcatenation()V
    @Lorg/junit/Test;()
    L0
    LINENUMBER 25 L0
    INVOKESTATIC java/lang/System.currentTimeMillis ()J
    LSTORE 1
    L1
    LINENUMBER 27 L1
    LDC ""
    ASTORE 3
    L2
    LINENUMBER 29 L2
    ICONST_0
    ISTORE 4
    L3
    FRAME APPEND [J java/lang/String I]
    ILOAD 4
    LDC 50000
    IF_ICMPGE L4
    L5
    LINENUMBER 30 L5
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ALOAD 3
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ILOAD 4
    INVOKEVIRTUAL java/lang/StringBuilder.append (I)Ljava/lang/StringBuilder;
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    ASTORE 3
    L6
    LINENUMBER 29 L6
    IINC 4 1
    GOTO L3
    L4
    LINENUMBER 33 L4
    FRAME CHOP 1
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    ALOAD 3
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    L7
    LINENUMBER 35 L7
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    INVOKESTATIC java/lang/System.currentTimeMillis ()J
    LLOAD 1
    LSUB
    INVOKEVIRTUAL java/io/PrintStream.println (J)V
    L8
    LINENUMBER 36 L8
    RETURN
    L9
    LOCALVARIABLE i I L3 L4 4
    LOCALVARIABLE this LStringConcatBenchmarkTest; L0 L9 0
    LOCALVARIABLE start J L1 L9 1
    LOCALVARIABLE result Ljava/lang/String; L2 L9 3
    MAXSTACK = 5
    MAXLOCALS = 5

    where the most important part is:

    FRAME APPEND [J java/lang/String I]
    ILOAD 4
    LDC 50000
    IF_ICMPGE L4
    L5
    LINENUMBER 30 L5
    NEW java/lang/StringBuilder
    DUP
    INVOKESPECIAL java/lang/StringBuilder.<init> ()V
    ALOAD 3
    INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
    ILOAD 4
    INVOKEVIRTUAL java/lang/StringBuilder.append (I)Ljava/lang/StringBuilder;
    INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
    ASTORE 3
    L6
    LINENUMBER 29 L6
    IINC 4 1
    GOTO L3

    which is equivalent to the following Java:

    for (int i = 0; i < 50_000; i++) {
        StringBuilder sb = new StringBuilder();
        sb.append(result);
        sb.append(i);
        result = sb.toString();
    }

    Note that a new StringBuilder is created and converted back to a
    String (sb.toString()) on every iteration: the Java compiler does
    not move that work outside the loop.
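    To make the cost of that per-iteration rebuild concrete, here is a
    minimal sketch that only counts characters, without building any
    strings. The class and method names (CopyCostEstimate, finalLength,
    charsCopiedByNaiveLoop) are hypothetical and not part of the project.
    It counts only the characters appended to each fresh StringBuilder,
    so it is a lower bound that ignores the extra copy made by
    toString().

```java
// Hypothetical helper, not part of the project: estimates how many
// characters the naive loop has to copy compared to the final length.
class CopyCostEstimate {

    // Length of the string "0" + "1" + ... + (n - 1).
    static long finalLength(int n) {
        long length = 0;
        for (int i = 0; i < n; i++) {
            length += Integer.toString(i).length();
        }
        return length;
    }

    // Characters appended to fresh StringBuilders when every iteration does
    // new StringBuilder().append(result).append(i): the whole intermediate
    // result is written again on each pass.
    static long charsCopiedByNaiveLoop(int n) {
        long copied = 0;
        long length = 0;
        for (int i = 0; i < n; i++) {
            length += Integer.toString(i).length();
            copied += length;
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 50_000;
        System.out.println("final length:      " + finalLength(n));
        System.out.println("characters copied: " + charsCopiedByNaiveLoop(n));
    }
}
```

    The copied count grows roughly quadratically with the number of
    iterations, while the final length grows only linearly, which is
    why a single StringBuilder outside the loop does so much less work.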

    summary

  • no loop

    @Test
    public void nonLoopConcatenation() {
        String a = "a";
        String b = "b";
        System.out.println(a + b);
    }

    is compiled as if it were written:

    @Test
    public void nonLoopConcatenation_usingStringBuilder() {
        String a = "a";
        String b = "b";
        System.out.println(new StringBuilder().append(a).append(b).toString());
    }

  • loop

    @Test
    public void loopConcatenation() {
        long start = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < 50_000; i++) {
            result += i;
        }
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);
    }

    is compiled as if it were written:

    @Test
    public void loopConcatenation_usingStringBuilder() {
        long start = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < 50_000; i++) {
            StringBuilder sb = new StringBuilder();
            sb.append(result);
            sb.append(i);
            result = sb.toString();
        }
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);
    }

Using Java 8, loopConcatenation() takes 6000-7000 ms.

java 9

  • all of the substrings building the final String are known at compile
    time:

    @Test
    public void nonLoopConcatenation() {
        String a = "a";
        String b = "b";
        System.out.println(a + b);
    }

    is compiled to:

    public nonLoopConcatenation()V
    @Lorg/junit/Test;()
    L0
    LINENUMBER 10 L0
    LDC "a"
    ASTORE 1
    L1
    LINENUMBER 11 L1
    LDC "b"
    ASTORE 2
    L2
    LINENUMBER 12 L2
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    ALOAD 1
    ALOAD 2
    INVOKEDYNAMIC makeConcatWithConstants(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; [
    // handle kind 0x6 : INVOKESTATIC
    java/lang/invoke/StringConcatFactory.makeConcatWithConstants(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
    // arguments:
    "\u0001\u0001"
    ]
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    L3
    LINENUMBER 13 L3
    RETURN
    L4
    LOCALVARIABLE this LStringConcatBenchmarkTest; L0 L4 0
    LOCALVARIABLE a Ljava/lang/String; L1 L4 1
    LOCALVARIABLE b Ljava/lang/String; L2 L4 2
    MAXSTACK = 3
    MAXLOCALS = 3

    the most important part is:

    ALOAD 1
    ALOAD 2
    INVOKEDYNAMIC makeConcatWithConstants(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; [
    // handle kind 0x6 : INVOKESTATIC
    java/lang/invoke/StringConcatFactory.makeConcatWithConstants(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
    // arguments:
    "\u0001\u0001"
    ]
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V

    the main differences from the Java 8 style are:

    • INVOKEDYNAMIC (instead of INVOKEVIRTUAL)
    • StringConcatFactory.makeConcatWithConstants (instead of
      StringBuilder)

  • substrings building the final String are NOT known at compile
    time:

    @Test
    public void loopConcatenation() {
        long start = System.currentTimeMillis();
        String result = "";
        for (int i = 0; i < 50_000; i++) {
            result += i;
        }
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);
    }

    is compiled to:

    public loopConcatenation()V
    @Lorg/junit/Test;()
    L0
    LINENUMBER 25 L0
    INVOKESTATIC java/lang/System.currentTimeMillis ()J
    LSTORE 1
    L1
    LINENUMBER 27 L1
    LDC ""
    ASTORE 3
    L2
    LINENUMBER 29 L2
    ICONST_0
    ISTORE 4
    L3
    FRAME APPEND [J java/lang/String I]
    ILOAD 4
    LDC 50000
    IF_ICMPGE L4
    L5
    LINENUMBER 30 L5
    ALOAD 3
    ILOAD 4
    INVOKEDYNAMIC makeConcatWithConstants(Ljava/lang/String;I)Ljava/lang/String; [
    // handle kind 0x6 : INVOKESTATIC
    java/lang/invoke/StringConcatFactory.makeConcatWithConstants(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/invoke/CallSite;
    // arguments:
    "\u0001\u0001"
    ]
    ASTORE 3
    L6
    LINENUMBER 29 L6
    IINC 4 1
    GOTO L3
    L4
    LINENUMBER 33 L4
    FRAME CHOP 1
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    ALOAD 3
    INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
    L7
    LINENUMBER 35 L7
    GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
    INVOKESTATIC java/lang/System.currentTimeMillis ()J
    LLOAD 1
    LSUB
    INVOKEVIRTUAL java/io/PrintStream.println (J)V
    L8
    LINENUMBER 36 L8
    RETURN
    L9
    LOCALVARIABLE i I L3 L4 4
    LOCALVARIABLE this LStringConcatBenchmarkTest; L0 L9 0
    LOCALVARIABLE start J L1 L9 1
    LOCALVARIABLE result Ljava/lang/String; L2 L9 3
    MAXSTACK = 5
    MAXLOCALS = 5

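    So the conclusion is the same as for nonLoopConcatenation(): the
    StringBuilder calls are replaced by a single INVOKEDYNAMIC call
    bootstrapped by StringConcatFactory.makeConcatWithConstants.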

summary

How string concatenation is performed is now a runtime decision, no
longer a compile-time one.

motivations

The reason for changing the compiler in this way is, according to the
project description (JEP 280), to “enable future optimizations of
String concatenation without requiring further changes to the bytecode
emitted by javac.”
Dynamic method invocation is an ideal solution for that challenge, as
it delays method implementation to the runtime. The developers of the
Java runtime can then improve the implementation of the factory class,
without all other developers needing to recompile their projects.

Recall that dynamic method invocation in Java works as follows:
first, the compiler places an invokedynamic bytecode instruction
in your method body to indicate that we’re trying to use a dynamic
method there. That indy instruction refers to a bootstrap method,
which is a regular Java method that is stored in a special attribute
in the class file. During runtime, this bootstrap method is called
to dynamically create the method we’re trying to invoke and wrap it
in a container object called a CallSite. Finally, the JVM extracts
a MethodHandle for the newly generated method from the CallSite and
executes the method, manipulating the stack as if it were a regular
method invocation.
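
The steps above can be imitated by hand, which may help in reading the
bytecode listings. The following is a minimal sketch, assuming Java 9
or later; the class name IndyConcatByHand and its helper methods are
hypothetical. It calls the same bootstrap method that the INVOKEDYNAMIC
instruction refers to, takes the MethodHandle out of the returned
CallSite and invokes it. The JVM links each call site once, whereas
this sketch repeats the bootstrap on every call purely for
illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

// Hypothetical demo class, not part of the project; needs Java 9+.
class IndyConcatByHand {

    // Imitates the call site emitted for `result + i`:
    // descriptor (String, int) -> String, recipe "\u0001\u0001".
    static String concat(String result, int i) {
        try {
            CallSite callSite = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class, int.class),
                    "\u0001\u0001"); // each \u0001 marks one dynamic argument
            MethodHandle target = callSite.dynamicInvoker();
            return (String) target.invokeExact(result, i);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Literal text can be embedded in the recipe around the argument markers.
    static String greet(String name) {
        try {
            CallSite callSite = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),
                    "makeConcatWithConstants",
                    MethodType.methodType(String.class, String.class),
                    "Hello, \u0001!");
            return (String) callSite.dynamicInvoker().invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concat("a", 1));  // prints "a1"
        System.out.println(greet("Ann"));    // prints "Hello, Ann!"
    }
}
```

The recipe string is the same "\u0001\u0001" argument that appears in
the Java 9 bytecode above: one placeholder per value pushed on the
stack before the INVOKEDYNAMIC instruction.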

strategies

StringConcatFactory offers several strategies for generating the
CallSite. They fall into two groups: byte-code generators that use ASM,
and MethodHandle-based ones.

  • BC_SB: generates byte-code equivalent to what javac generates
    in Java 8.
  • BC_SB_SIZED: generates byte-code equivalent to what javac
    generates, but tries to estimate the initial size of the
    StringBuilder.
  • BC_SB_SIZED_EXACT: generates byte-code equivalent to what javac
    generates, but computes the exact size of the StringBuilder.
  • MH_SB_SIZED: combines MethodHandles that end up calling the
    StringBuilder with an estimated initial size.
  • MH_SB_SIZED_EXACT: combines MethodHandles that end up calling the
    StringBuilder with an exact size.
  • MH_INLINE_SIZED_EXACT: combines MethodHandles that create the
    String directly from an exactly sized byte[], with no copy.

The default and most performant strategy is MH_INLINE_SIZED_EXACT,
which can lead to a 3 to 4 times performance improvement. You can
override the strategy on the command line by defining the system
property java.lang.invoke.stringConcat.
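
To illustrate what "sized exact" means, here is a rough sketch in
plain Java; it is an analogy, not the JDK implementation, and the class
and method names (SizedExactSketch, concatSizedExact) are hypothetical.
It computes the exact result length first and fills a single buffer
once, instead of letting a StringBuilder grow. Unlike the real
strategy, which per the description above works on a byte[] without a
final copy, ordinary code still pays one copy in new String(char[]).

```java
// Hypothetical illustration of the "sized exact" idea; not the JDK code.
class SizedExactSketch {

    static String concatSizedExact(String prefix, int number) {
        String left = String.valueOf(prefix);     // null becomes "null", as with +
        String digits = Integer.toString(number);

        // 1. Compute the exact length of the result up front.
        char[] buffer = new char[left.length() + digits.length()];

        // 2. Write every part straight into its final position.
        left.getChars(0, left.length(), buffer, 0);
        digits.getChars(0, digits.length(), buffer, left.length());

        // 3. Build the String; user code cannot avoid this last copy.
        return new String(buffer);
    }

    public static void main(String[] args) {
        System.out.println(concatSizedExact("a", 12)); // prints "a12"
    }
}
```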

Using Java 9, loopConcatenation() takes 1000-1500 ms.

project description

We provide tests with benchmarks for the above examples.

The default Java version is 9. If you want to switch to 8, you
have to modify build.gradle:

  sourceCompatibility = 1.8

summary

Please note that, irrespective of the Java version, using the
well-known explicit StringBuilder construction (with preallocated
space) is the best approach to this problem.

  @Test
  public void loopConcatenation_usingStringBuilder_rightWay() {
      long start = System.currentTimeMillis();
      // 60000 is only the initial capacity; the final string has
      // 238890 characters, so the builder still grows a few times
      StringBuilder sb = new StringBuilder(60000);
      for (int i = 0; i < 50_000; i++) {
          sb.append(i);
      }
      System.out.println(sb);
      System.out.println(System.currentTimeMillis() - start);
  }

Using a well-written loop with StringBuilder, the problem takes just 40-50 ms.