Comment detail
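Filter that passes some HTML tags (Nested Flatten)
This looks pretty tough without regular expressions... <br> gets escaped, so I think '<br' should be added to the list. Also, attr does not seem to be run through asLowercase, so HREF gets removed.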
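The HREF point can be sketched roughly as follows. This is only a minimal illustration, assuming a whitelist keyed by lowercase strings; the names accepts and isAllowed are hypothetical and are not taken from either posted solution.

```smalltalk
| accepts isAllowed |
"Hypothetical whitelist: lowercase tag name -> allowed lowercase attribute names."
accepts := Dictionary new.
accepts at: 'a' put: #('href' 'name').
accepts at: 'strong' put: #().
accepts at: 'br' put: #().

"Hypothetical helper: lowercase both names before looking them up,
 so that A/HREF is treated the same as a/href."
isAllowed := [:tag :attr |
    (accepts at: tag asLowercase ifAbsent: [#()]) includes: attr asLowercase].

isAllowed value: 'A' value: 'HREF'.      "accepted"
isAllowed value: 'a' value: 'onClick'.   "rejected: attribute not listed"
isAllowed value: 'blink' value: 'href'.  "rejected: tag not listed"
```

Lowercasing at the point of lookup keeps the original spelling available for output, so an accepted HREF could still be written out as the author typed it.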
I tidied it up and rewrote it with the next problem in mind.
| string accepts in out upToAnyOf letters separators |
string := '<a title="(>_<;)" href=''www.google.com'' name=''hoge'' target=_blank>link</a> <blink>and</blink> <strong onClick=''alert("NG")''>click<br/>me!</strong>'.
accepts := {#a->#(name href). #strong->#(). #br->#()} as: Dictionary.
string := string copyReplaceAll: '<br>' with: '<br/>'.
in := string readStream.
out := String new writeStream.
upToAnyOf := [:arr | String streamContents: [:ss |
arr := arr copyWith: nil.
[arr includes: in peek] whileFalse: [ss nextPut: in next]]].
letters := Character alphabet asArray, Character alphabet asUppercase.
separators := Character separators, #($/ $>).
[out nextPutAll: (in upTo: $<) escapeEntities. in atEnd] whileFalse: [
| tag lt isClose isAccepted blank rest |
(isClose := in peek == $/) ifTrue: [in next].
tag := upToAnyOf value: separators.
lt := '<', (isClose ifTrue: ['/'] ifFalse: ['']).
(isAccepted := accepts keys includes: tag asLowercase) ifFalse: [lt := lt escapeEntities].
out nextPutAll: lt, tag.
[blank := upToAnyOf value: letters, '>'. {nil. $>} includes: in peek] whileFalse: [
| attr equal value quote |
attr := upToAnyOf value: #($= $>).
equal := in peek == $= ifTrue: [in next asString] ifFalse: [''].
value := (#($' $") includes: (quote := in peek))
ifTrue: [quote asString, (in next; upTo: quote), quote asString]
ifFalse: [upToAnyOf value: #($ $>)].
out nextPutAll: (isAccepted
ifFalse: [blank, attr, equal, value escapeEntities]
ifTrue: [((accepts at: tag asLowercase) includes: attr asLowercase)
ifTrue: [blank, attr, equal, value] ifFalse: ['']])].
rest := blank, (in peek == $> ifTrue: [in next asString] ifFalse: ['']).
out nextPutAll: (isAccepted ifTrue: [rest] ifFalse: [rest escapeEntities])].
World findATranscript: nil.
Transcript cr; show: out contents
"=> <a href='www.google.com' name='hoge'>link</a> &lt;blink&gt;and&lt;/blink&gt; <strong>click<br/>me!</strong> "






sumim #2761 [Smalltalk]
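As usual, regular expressions are not available, so I did it procedurally. At this rate, I dread the sequel, which will surely demand something more complex... (^_^;)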
| string in tag out save rest |
string := '<a href=''www.google.com''>link</a> <blink>and</blink> <strong onClick=''alert("NG")''>click<br/>me!</strong>'.
in := string readStream.
out := String new writeStream.
[in atEnd] whileFalse: [
    "Copy plain text up to the next tag, then step back onto the $<."
    out nextPutAll: (in upTo: $<).
    in back.
    save := in position.
    tag := in upTo: Character space.
    (tag includes: $/) ifTrue: [in position: save. tag := in upTo: $>. in back].
    "Accepted tags pass through; any other tag gets its $< escaped."
    out nextPutAll: ((#('<a' '<br/' '<strong' '</a' '</strong') includes: tag asLowercase)
        ifTrue: [tag]
        ifFalse: [tag := '&lt;', tag allButFirst]).
    tag := tag asLowercase.
    "Consume attributes for as long as the rest of the tag contains an $=."
    [save := in position. (rest := in upTo: $>) includes: $=] whileTrue: [
        | attr quote data |
        in position: save.
        attr := in upTo: $=.
        quote := (#($' $") includes: in peek) ifTrue: [in next] ifFalse: [Character space].
        data := in upTo: quote.
        quote := quote = Character space ifTrue: [''] ifFalse: [quote asString].
        data := attr, '=', quote, data, quote.
        in skipSeparators.
        (tag = '<a' and: [#(href name) includes: attr])
            ifTrue: [out space; nextPutAll: data].
        (#('<a' '<br/' '<strong') includes: tag) ifFalse: [
            out space.
            data do: [:chr |
                chr = $< ifTrue: [out nextPutAll: '&lt;'] ifFalse: [out nextPut: chr]]]].
    out nextPutAll: rest, '>'].
^out contents
"=> '<a href=''www.google.com''>link</a> &lt;blink>and&lt;/blink> <strong>click<br/>me!</strong>' "