This post looks at how the format methods of String and MessageFormat differ and how each one is implemented.
1. When to use each
String.format: layout justification and alignment; common formats for numeric, string, and date/time data; and locale-specific output.
MessageFormat.format: producing concatenated messages in a language-neutral way.
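To make the two use cases concrete, here is a minimal sketch. The class name FormatUseCases and the sample patterns are made up for illustration; only String.format and MessageFormat come from the JDK.

```java
import java.text.MessageFormat;
import java.util.Locale;

final class FormatUseCases {
    // printf-style: width, alignment and precision are set per specifier.
    // %-8s left-justifies in 8 columns, %8.2f right-justifies with 2 decimals.
    static String tableRow(String name, double price) {
        return String.format(Locale.US, "%-8s|%8.2f", name, price);
    }

    // MessageFormat: arguments are referenced by index, so a translated
    // pattern can reorder them without touching the calling code.
    static String greeting(String pattern, String user, int count) {
        return new MessageFormat(pattern, Locale.US).format(new Object[] {user, count});
    }

    public static void main(String[] args) {
        System.out.println(tableRow("apple", 3.5));
        System.out.println(greeting("{0} has {1} new messages", "huhx", 3));
        System.out.println(greeting("{1} new messages for {0}", "huhx", 3));
    }
}
```

The same arguments feed both greeting patterns; only the pattern text decides the order in which they appear.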
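2. Performance comparison
MessageFormat parses the pattern first and then inserts the values at the positions it recorded, so in the author's view it performs better than String.format, which uses a regular expression to locate its placeholders. In short: MessageFormat > String.format.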
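The "parse first, insert later" idea only pays off if the parsed pattern is reused: the static MessageFormat.format builds a new MessageFormat on every call. A minimal sketch of reusing one instance follows; the class name ReusedPattern is hypothetical, and no timing figures are claimed here.

```java
import java.text.MessageFormat;

final class ReusedPattern {
    // The pattern is parsed once, in the constructor.
    // MessageFormat is not synchronized, so keep one instance per thread.
    private final MessageFormat format = new MessageFormat("name={1}, age={0}");

    // Each call only walks the recorded offsets and inserts the arguments.
    String render(Object... args) {
        return format.format(args);
    }

    public static void main(String[] args) {
        ReusedPattern p = new ReusedPattern();
        System.out.println(p.render(25, "huhx"));
        System.out.println(p.render(30, "tom"));
    }
}
```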
3. Cases that throw exceptions
String message = MessageFormat.format("name={0}, age={}", 25, "huhx");
// java.lang.IllegalArgumentException: can't parse argument number:

String string = String.format("name=%s, age=%d", "huhx");
// java.util.MissingFormatArgumentException: Format specifier '%d'
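The two failures can be reproduced with a small harness. This is a sketch: the class FormatFailures and its helper methods are hypothetical names, and each helper returns the exception's simple class name instead of letting it propagate.

```java
import java.text.MessageFormat;
import java.util.MissingFormatArgumentException;

final class FormatFailures {
    // Returns the formatted text, or the exception class name on failure.
    static String tryMessageFormat(String pattern, Object... args) {
        try {
            return MessageFormat.format(pattern, args);
        } catch (IllegalArgumentException e) {
            return e.getClass().getSimpleName();
        }
    }

    static String tryStringFormat(String pattern, Object... args) {
        try {
            return String.format(pattern, args);
        } catch (MissingFormatArgumentException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // "{}" has no argument number, so parsing the pattern fails.
        System.out.println(tryMessageFormat("name={0}, age={}", 25, "huhx"));
        // "%d" has no matching argument.
        System.out.println(tryStringFormat("name=%s, age=%d", "huhx"));
        // A missing argument is not an error for MessageFormat:
        // the placeholder is written back as-is.
        System.out.println(tryMessageFormat("name={1}, age={0}", 25));
    }
}
```

Note the asymmetry in the last call: where String.format throws for a missing argument, MessageFormat leaves {1} in the output.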
We will use the simple example below to analyze how the two work:
public void messageFormat() {
    String string = String.format("name=%s, age=%d", "huhx", 25);
    String message = MessageFormat.format("name={1}, age={0}, {1}", 25, "huhx");
    System.out.println(string);
    System.out.println(message);
}
// name=huhx, age=25
// name=huhx, age=25, huhx
Internally, String.format delegates to a Formatter, which uses a regular expression to find the placeholders. Here is the source of its format method.
public Formatter format(Locale l, String format, Object ... args) {
    ensureOpen();
    // index of last argument referenced
    int last = -1;
    // last ordinary index
    int lasto = -1;

    FormatString[] fsa = parse(format);
    for (int i = 0; i < fsa.length; i++) {
        FormatString fs = fsa[i];
        int index = fs.index();
        try {
            switch (index) {
            case -2:  // fixed string, "%n", or "%%"
                fs.print(null, l);
                break;
            case -1:  // relative index
                if (last < 0 || (args != null && last > args.length - 1))
                    throw new MissingFormatArgumentException(fs.toString());
                fs.print((args == null ? null : args[last]), l);
                break;
            case 0:  // ordinary index
                lasto++;
                last = lasto;
                if (args != null && lasto > args.length - 1)
                    throw new MissingFormatArgumentException(fs.toString());
                fs.print((args == null ? null : args[lasto]), l);
                break;
            default:  // explicit index
                last = index - 1;
                if (args != null && last > args.length - 1)
                    throw new MissingFormatArgumentException(fs.toString());
                fs.print((args == null ? null : args[last]), l);
                break;
            }
        } catch (IOException x) {
            lastException = x;
        }
    }
    return this;
}
This is the regular expression used inside Formatter:
private static final String formatSpecifier = "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])";
Applying the formatSpecifier regular expression to name=%s, age=%d produces a sequence of FormatString objects: the fsa array returned by the parse(format) call in the listing above. It has four entries, roughly as follows:
1. Type FixedString, content name=
2. Type FormatSpecifier, content %s
3. Type FixedString, content , age=
4. Type FormatSpecifier, content %d
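The four entries can be reproduced with java.util.regex and the same expression. This is a simplified sketch, not the JDK's parse method: the class SpecifierScan and the "FixedString:" / "FormatSpecifier:" labels are illustrative, and unlike the real parser it does not reject a stray % character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpecifierScan {
    // Same expression as Formatter.formatSpecifier quoted above.
    private static final Pattern SPEC = Pattern.compile(
        "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])");

    // Splits a format string into literal text and format specifiers.
    static List<String> split(String format) {
        List<String> parts = new ArrayList<>();
        Matcher m = SPEC.matcher(format);
        int pos = 0;
        while (m.find()) {
            if (m.start() > pos) {
                // Text between two specifiers is a fixed string.
                parts.add("FixedString:" + format.substring(pos, m.start()));
            }
            parts.add("FormatSpecifier:" + m.group());
            pos = m.end();
        }
        if (pos < format.length()) {
            parts.add("FixedString:" + format.substring(pos));
        }
        return parts;
    }

    public static void main(String[] args) {
        for (String part : split("name=%s, age=%d")) {
            System.out.println(part);
        }
    }
}
```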
A short note on FixedString and FormatSpecifier: both implement the FormatString interface, which exposes the following three methods.
private interface FormatString {
    int index();
    void print(Object arg, Locale l) throws IOException;
    String toString();
}
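For a FixedString, index is -2. For a FormatSpecifier, index is 0 when the specifier is an ordinary one such as %s or %d; as the switch in Formatter.format shows, it is -1 for a relative index (%<s) and the argument number itself for an explicit index (%2$s).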
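The three FormatSpecifier index kinds correspond to three ways of writing a specifier. A small sketch, with a hypothetical class name IndexKinds:

```java
final class IndexKinds {
    // Ordinary index (index 0): arguments are consumed left to right.
    static String ordinary() {
        return String.format("%s-%s", "a", "b");
    }

    // Explicit index (index n): %2$s picks the second argument.
    static String explicit() {
        return String.format("%2$s-%1$s", "a", "b");
    }

    // Relative index (index -1): %<s reuses the previous specifier's argument.
    static String relative() {
        return String.format("%s-%<s", "a", "b");
    }

    public static void main(String[] args) {
        System.out.println(ordinary()); // a-b
        System.out.println(explicit()); // b-a
        System.out.println(relative()); // a-a
    }
}
```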
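1. FixedString: fs.print simply writes the string content into the StringBuilder held by the Formatter.
2. FormatSpecifier: the fs.print implementation is more involved, handling precision, alignment, layout adjustment, and so on.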
Finally, Formatter's toString method is called, which returns the content of the StringBuilder it maintains (the field a).
public String toString() {
    ensureOpen();
    return a.toString();
}
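Put simply, MessageFormat walks the pattern character by character and maintains arrays that describe the {} placeholders: the position of each one, together with the index of the argument it refers to. We again use the following code for the analysis: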
String message = MessageFormat.format("name={1}, age={0}, {1}", 25, "huhx");
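It first calls the applyPattern method, whose source is shown further down. Once that call finishes, the following useful information has been recorded.
offsets is an int array that now holds 5, 11, 13: the positions of {1}, {0}, and {1}, measured in the pattern text with the placeholders removed ("name=, age=, "). maxOffset is 2, meaning there are three {n} placeholders. argumentNumbers holds 1, 0, 1: the value of n in each {n} of the pattern. The code below shows the details of this process.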
public void applyPattern(String pattern) {
    StringBuilder[] segments = new StringBuilder[4];
    // Allocate only segments[SEG_RAW] here. The rest are
    // allocated on demand.
    segments[SEG_RAW] = new StringBuilder();

    int part = SEG_RAW;
    int formatNumber = 0;
    boolean inQuote = false;
    int braceStack = 0;
    maxOffset = -1;
    for (int i = 0; i < pattern.length(); ++i) {
        char ch = pattern.charAt(i);
        if (part == SEG_RAW) {
            if (ch == '\'') {
                if (i + 1 < pattern.length()
                    && pattern.charAt(i+1) == '\'') {
                    segments[part].append(ch);  // handle doubles
                    ++i;
                } else {
                    inQuote = !inQuote;
                }
            } else if (ch == '{' && !inQuote) {
                part = SEG_INDEX;
                if (segments[SEG_INDEX] == null) {
                    segments[SEG_INDEX] = new StringBuilder();
                }
            } else {
                segments[part].append(ch);
            }
        } else {
            if (inQuote) {  // just copy quotes in parts
                segments[part].append(ch);
                if (ch == '\'') {
                    inQuote = false;
                }
            } else {
                switch (ch) {
                case ',':
                    if (part < SEG_MODIFIER) {
                        if (segments[++part] == null) {
                            segments[part] = new StringBuilder();
                        }
                    } else {
                        segments[part].append(ch);
                    }
                    break;
                case '{':
                    ++braceStack;
                    segments[part].append(ch);
                    break;
                case '}':
                    if (braceStack == 0) {
                        part = SEG_RAW;
                        makeFormat(i, formatNumber, segments);
                        formatNumber++;
                        // throw away other segments
                        segments[SEG_INDEX] = null;
                        segments[SEG_TYPE] = null;
                        segments[SEG_MODIFIER] = null;
                    } else {
                        --braceStack;
                        segments[part].append(ch);
                    }
                    break;
                case ' ':
                    // Skip any leading space chars for SEG_TYPE.
                    if (part != SEG_TYPE || segments[SEG_TYPE].length() > 0) {
                        segments[part].append(ch);
                    }
                    break;
                case '\'':
                    inQuote = true;
                    // fall through, so we keep quotes in other parts
                default:
                    segments[part].append(ch);
                    break;
                }
            }
        }
    }
    if (braceStack == 0 && part != 0) {
        maxOffset = -1;
        throw new IllegalArgumentException("Unmatched braces in the pattern.");
    }
    this.pattern = segments[0].toString();
}
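Stripped of quoting, nested braces, and format types, the bookkeeping done by applyPattern can be sketched in a few lines. This is a simplified illustration, not the JDK code: the class PlaceholderScan is hypothetical and only understands plain {n} placeholders.

```java
import java.util.ArrayList;
import java.util.List;

final class PlaceholderScan {
    // Pattern text with every {n} removed (plays the role of segments[SEG_RAW]).
    final StringBuilder raw = new StringBuilder();
    // Position in raw where each placeholder's value must be inserted.
    final List<Integer> offsets = new ArrayList<>();
    // The n of each {n}, in order of appearance.
    final List<Integer> argumentNumbers = new ArrayList<>();

    PlaceholderScan(String pattern) {
        int i = 0;
        while (i < pattern.length()) {
            char ch = pattern.charAt(i);
            if (ch == '{') {
                int close = pattern.indexOf('}', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unmatched braces in the pattern.");
                }
                // parseInt throws NumberFormatException (an IllegalArgumentException) for "{}".
                argumentNumbers.add(Integer.parseInt(pattern.substring(i + 1, close)));
                offsets.add(raw.length());
                i = close + 1;
            } else {
                raw.append(ch);
                i++;
            }
        }
    }

    public static void main(String[] args) {
        PlaceholderScan s = new PlaceholderScan("name={1}, age={0}, {1}");
        System.out.println(s.raw);             // name=, age=, 
        System.out.println(s.offsets);         // [5, 11, 13]
        System.out.println(s.argumentNumbers); // [1, 0, 1]
    }
}
```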
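The format step then works from the information that applyPattern collected. Roughly, it loops from 0 to maxOffset, reads the corresponding entry of offsets, and inserts the matching argument at that position. For example, the first argument, the number 25, is inserted at offset 11 of the stored pattern (the 12th character position), while the string huhx is inserted at offsets 5 and 13 (the 6th and 14th positions). The assembled string is then returned. Below is the source of subformat, which does this work.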
private StringBuffer subformat(Object[] arguments, StringBuffer result,
                               FieldPosition fp, List<AttributedCharacterIterator> characterIterators) {
    // note: this implementation assumes a fast substring & index.
    // if this is not true, would be better to append chars one by one.
    int lastOffset = 0;
    int last = result.length();
    for (int i = 0; i <= maxOffset; ++i) {
        result.append(pattern.substring(lastOffset, offsets[i]));
        lastOffset = offsets[i];
        int argumentNumber = argumentNumbers[i];
        if (arguments == null || argumentNumber >= arguments.length) {
            result.append('{').append(argumentNumber).append('}');
            continue;
        }
        // int argRecursion = ((recursionProtection >> (argumentNumber*2)) & 0x3);
        if (false) { // if (argRecursion == 3){
            // prevent loop!!!
            result.append('\uFFFD');
        } else {
            Object obj = arguments[argumentNumber];
            String arg = null;
            Format subFormatter = null;
            if (obj == null) {
                arg = "null";
            } else if (formats[i] != null) {
                subFormatter = formats[i];
                if (subFormatter instanceof ChoiceFormat) {
                    arg = formats[i].format(obj);
                    if (arg.indexOf('{') >= 0) {
                        subFormatter = new MessageFormat(arg, locale);
                        obj = arguments;
                        arg = null;
                    }
                }
            } else if (obj instanceof Number) {
                // format number if can
                subFormatter = NumberFormat.getInstance(locale);
            } else if (obj instanceof Date) {
                // format a Date if can
                subFormatter = DateFormat.getDateTimeInstance(
                         DateFormat.SHORT, DateFormat.SHORT, locale);//fix
            } else if (obj instanceof String) {
                arg = (String) obj;

            } else {
                arg = obj.toString();
                if (arg == null) arg = "null";
            }

            // At this point we are in two states, either subFormatter
            // is non-null indicating we should format obj using it,
            // or arg is non-null and we should use it as the value.

            if (characterIterators != null) {
                // If characterIterators is non-null, it indicates we need
                // to get the CharacterIterator from the child formatter.
                if (last != result.length()) {
                    characterIterators.add(
                        createAttributedCharacterIterator(result.substring
                                                          (last)));
                    last = result.length();
                }
                if (subFormatter != null) {
                    AttributedCharacterIterator subIterator =
                               subFormatter.formatToCharacterIterator(obj);

                    append(result, subIterator);
                    if (last != result.length()) {
                        characterIterators.add(
                                     createAttributedCharacterIterator(
                                     subIterator, Field.ARGUMENT,
                                     Integer.valueOf(argumentNumber)));
                        last = result.length();
                    }
                    arg = null;
                }
                if (arg != null && arg.length() > 0) {
                    result.append(arg);
                    characterIterators.add(
                             createAttributedCharacterIterator(
                             arg, Field.ARGUMENT,
                             Integer.valueOf(argumentNumber)));
                    last = result.length();
                }
            }
            else {
                if (subFormatter != null) {
                    arg = subFormatter.format(obj);
                }
                last = result.length();
                result.append(arg);
                if (i == 0 && fp != null && Field.ARGUMENT.equals(
                              fp.getFieldAttribute())) {
                    fp.setBeginIndex(last);
                    fp.setEndIndex(result.length());
                }
                last = result.length();
            }
        }
    }
    result.append(pattern.substring(lastOffset, pattern.length()));
    if (characterIterators != null && last != result.length()) {
        characterIterators.add(createAttributedCharacterIterator(
                               result.substring(last)));
    }
    return result;
}
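The core of subformat, leaving out sub-formatters and character iterators, is the offset-driven insertion loop. Here is a minimal sketch under those simplifications; the class OffsetInsert and its fill method are hypothetical, and values are rendered with String.valueOf, whereas the real code formats numbers and dates with NumberFormat and DateFormat.

```java
final class OffsetInsert {
    // raw: the pattern with placeholders removed.
    // offsets[i]: where in raw the i-th value goes.
    // argumentNumbers[i]: which argument the i-th placeholder refers to.
    static String fill(String raw, int[] offsets, int[] argumentNumbers, Object... args) {
        StringBuilder result = new StringBuilder();
        int lastOffset = 0;
        for (int i = 0; i < offsets.length; i++) {
            // Copy the literal text up to the next insertion point.
            result.append(raw, lastOffset, offsets[i]);
            lastOffset = offsets[i];
            int n = argumentNumbers[i];
            if (args == null || n >= args.length) {
                // Missing argument: write the placeholder back, as subformat does.
                result.append('{').append(n).append('}');
            } else {
                result.append(String.valueOf(args[n]));
            }
        }
        // Copy whatever literal text follows the last placeholder.
        result.append(raw, lastOffset, raw.length());
        return result.toString();
    }

    public static void main(String[] args) {
        int[] offsets = {5, 11, 13};
        int[] argumentNumbers = {1, 0, 1};
        System.out.println(fill("name=, age=, ", offsets, argumentNumbers, 25, "huhx"));
        System.out.println(fill("name=, age=, ", offsets, argumentNumbers, 25));
    }
}
```

With the values recorded for name={1}, age={0}, {1}, the first call rebuilds name=huhx, age=25, huhx; the second shows the placeholder being kept when an argument is missing.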
Java basics: the format methods of String and MessageFormat
Original article: https://www.cnblogs.com/huhx/p/baseusejavamessageformat.html