Learning Solr 5 with Yida: Bulk Indexing JSON Data
Published: 2019-05-24


        Suppose you have a batch of JSON data like this:

 

[  {"id":"1", "name":"Red Lobster", "city":"San Francisco, CA", "type":"Sit-down Chain", "state":"California", "tags":["sea food", "sit down"], "price":33.00},  {"id":"2", "name":"Red Lobster", "city":"Atlanta, GA", "type":"Sit-down Chain", "state":"Georgia", "tags":["sea food", "sit-down"], "price":22.00},  {"id":"3", "name":"Red Lobster", "city":"New York, NY", "type":"Sit-down Chain", "state":"New York", "tags":["sea food", "sit-down"], "price":29.00},  {"id":"4", "name":"McDonalds", "city":"San Francisco, CA", "type":"Fast Food", "state":"California", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":9.00},  {"id":"5", "name":"McDonalds", "city":"Atlanta, GA", "type":"Fast Food", "state":"Georgia", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00},  {"id":"6", "name":"McDonalds", "city":"New York, NY", "type":"Fast Food", "state":"New York", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00},  {"id":"7", "name":"McDonalds", "city":"Chicago, IL", "type":"Fast Food", "state":"Illinois", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00},  {"id":"8", "name":"McDonalds", "city":"Austin, TX", "type":"Fast Food", "state":"Texas", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00},  {"id":"9", "name":"Pizza Hut", "city":"Atlanta, GA", "type":"Sit-down Chain", "state":"Georgia", "tags":["pizza", "sit-down", "delivery"], "price":15.00},  {"id":"10", "name":"Pizza Hut", "city":"New York, NY", "type":"Sit-down Chain", "state":"New York", "tags":["pizza", "sit-down", "delivery"], "price":24.00},  {"id":"11", "name":"Pizza Hut", "city":"Austin, TX", "type":"Sit-down Chain", "state":"Texas", "tags":["pizza", "sit-down", "delivery"], "price":18.00},  {"id":"12", "name":"Freddy's Pizza Shop", "city":"Los Angeles, CA", "type":"Local Sit-down", "state":"California", "tags":["pizza", "pasta", "sit-down"], "price":25.00},  
{"id":"13", "name":"The Iberian Pig", "city":"Atlanta, GA", "type":"Upscale", "state":"Georgia", "tags":["spanish", "tapas", "sit-down", "upscale"], "price":45.00},  {"id":"14", "name":"Sprig", "city":"Atlanta, GA", "type":"Local Sit-down", "state":"Georgia", "tags":["sit-down", "gluten-free", "southern cuisine"], "price":15.00},  {"id":"15", "name":"Starbucks", "city":"San Francisco, CA", "type":"Coffee Shop", "state":"California", "tags":["coffee", "breakfast"], "price":7.50},  {"id":"16", "name":"Starbucks", "city":"Atlanta, GA", "type":"Coffee Shop", "state":"Georgia", "tags":["coffee", "breakfast"], "price":4.00},  {"id":"17", "name":"Starbucks", "city":"New York, NY", "type":"Coffee Shop", "state":"New York", "tags":["coffee", "breakfast"], "price":6.50},  {"id":"18", "name":"Starbucks", "city":"Chicago, IL", "type":"Coffee Shop", "state":"Illinois", "tags":["coffee", "breakfast"], "price":6.00},  {"id":"19", "name":"Starbucks", "city":"Austin, TX", "type":"Coffee Shop", "state":"Texas", "tags":["coffee", "breakfast"], "price":5.00},  {"id":"20", "name":"Starbucks", "city":"Greenville, SC", "type":"Coffee Shop", "state":"South Carolina", "tags":["coffee", "breakfast"], "price":3.00}]

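   You want to import it into Solr for indexing. How? The Solr Web UI can do this directly: the Documents menu on the left is for importing Documents (updating existing Documents is supported too), and the plural "s" signals that several Documents can be imported in one batch.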

 Document Type indicates the source format of your Document data: JSON, XML, CSV, and so on.

 

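 Commit Within is the number of milliseconds within which Solr will commit the submitted Document so that it becomes visible to searches. It is a deadline for the commit, not a timeout for the request.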

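 Overwrite controls whether a new Document replaces an already indexed Document that has the same unique key. Set it to false and nothing already in the index is replaced; the new Document is simply appended to the existing index data.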

 Boost sets the weight of the Document; the default is 1.0.
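 The form fields above correspond to options that Solr's JSON update format accepts inside an "add" command. Here is a minimal sketch in Python of what such a command looks like for a single Document. The helper name build_add_command and the sample values are my own assumptions for illustration, not something taken from the Solr UI.

```python
import json


def build_add_command(doc, commit_within_ms=1000, overwrite=True, boost=1.0):
    """Wrap one document in a Solr JSON "add" command (hypothetical helper)."""
    command = {
        "add": {
            "doc": doc,
            "boost": boost,                    # the Boost form field
            "overwrite": overwrite,            # the Overwrite form field
            "commitWithin": commit_within_ms,  # the Commit Within form field (ms)
        }
    }
    return json.dumps(command, ensure_ascii=False)


# Sample data: a shortened version of the first restaurant in the list.
doc = {"id": "1", "name": "Red Lobster", "city": "San Francisco, CA", "price": 33.00}
print(build_add_command(doc))
```

 The printed string is one complete command, which is the shape the Solr Command (raw XML or JSON) Document Type expects.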

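 If you only need to import a single JSON object, simply select JSON as the Document Type. To import several Documents at once, select Solr Command (raw XML or JSON) as the Document Type instead. In that case the JSON has to follow a specific format, defined like this: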

{
    "add": {
        "doc": {.......}
    },
    "add": {
        "doc": {.......}
    },
    "add": {
        "doc": {.......}
    },
    "add": {
        "doc": {.......}
    },
    "add": {
        "doc": {.......}
    },
    ............. and so on.
}

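    Each {.........} is one of your Document objects, and everything else is fixed boilerplate. This format makes up for the limitation of the JSON Document Type, which imports Documents only one at a time and is too slow for bulk loads. Use it whenever you need to import many Documents in one batch.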

 

    If you need to import XML data, select XML as the Document Type.

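 The content between the <doc></doc> tags is your XML data. It has the same drawback as the JSON Document Type: only one Document per import. To import XML in bulk, you can again select Solr Command (raw XML or JSON) as the Document Type, in which case the data should look something like this: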

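<add>
    <doc>
        <field name="...">xxxx</field>
        <field name="...">xxxxxxxx</field>
        <field name="...">xxxxxxxx</field>
    </doc>
    <doc>
        <field name="...">xxxx</field>
        <field name="...">xxxxxxxx</field>
        <field name="...">xxxxxxxx</field>
    </doc>
    <doc>
        <field name="...">xxxx</field>
        <field name="...">xxxxxxxx</field>
        <field name="...">xxxxxxxx</field>
    </doc>
    ............ and so on
</add>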
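    If your Documents already exist as Python dictionaries, the XML bulk format can be generated rather than typed by hand. The following is only a sketch under my own assumptions: build_add_xml is a hypothetical helper, and it assumes one <add> element containing one <doc> per Document, with list values repeated as multiple <field> elements of the same name.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build a Solr XML <add> message from a list of dicts (hypothetical helper)."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A list becomes several <field> elements sharing one name.
            values = value if isinstance(value, list) else [value]
            for item in values:
                field = ET.SubElement(doc_el, "field", {"name": name})
                field.text = str(item)
    return ET.tostring(add, encoding="unicode")


docs = [
    {"id": "1", "name": "Red Lobster", "tags": ["sea food", "sit down"]},
    {"id": "2", "name": "Red Lobster", "tags": ["sea food", "sit-down"]},
]
print(build_add_xml(docs))
```

    ElementTree escapes special characters such as & and < in field values, which is easy to get wrong when concatenating strings manually.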

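    To update a Document, add it again with the same unique key and Solr replaces the old version; note that Solr's XML update syntax has no <update> element. To remove Documents there is the <delete> element, you get the idea. I covered these commands earlier when discussing post.jar, so refer to that post for details. OK, enough background. Now I will walk through the procedure using JSON data as the example. Suppose I have JSON data like the restaurant list at the beginning of this post.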

     First, we need to extract the fields from the JSON data and define them in the schema.xml configuration file.
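     Before writing the field definitions, it can help to scan the data and list which fields occur, what kind of values they hold, and which ones hold arrays (those would need multiValued="true" in schema.xml). Here is a rough sketch; summarize_fields is a hypothetical helper of mine, and it only reports Python type names, so choosing the actual Solr field types is still up to you.

```python
def summarize_fields(docs):
    """Collect field names, Python value types, and multi-valued flags."""
    summary = {}
    for doc in docs:
        for name, value in doc.items():
            info = summary.setdefault(name, {"multi_valued": False, "types": set()})
            if isinstance(value, list):
                info["multi_valued"] = True
                values = value
            else:
                values = [value]
            for item in values:
                info["types"].add(type(item).__name__)
    return summary


docs = [
    {"id": "1", "name": "Red Lobster", "tags": ["sea food", "sit down"], "price": 33.00},
    {"id": "15", "name": "Starbucks", "tags": ["coffee", "breakfast"], "price": 7.50},
]
for field_name, info in summarize_fields(docs).items():
    print(field_name, sorted(info["types"]), "multi-valued" if info["multi_valued"] else "single")
```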
   Next, we need to convert the plain JSON data into the format Solr recognizes:

{	"add": {		"doc": {"id":"1", "name":"Red Lobster", "city":"San Francisco, CA", "type":"Sit-down Chain", "state":"California", "tags":["sea food", "sit down"], "price":33.00}	},	"add": {		"doc": {"id":"2", "name":"Red Lobster", "city":"Atlanta, GA", "type":"Sit-down Chain", "state":"Georgia", "tags":["sea food", "sit-down"], "price":22.00}	},	"add": {		"doc": {"id":"3", "name":"Red Lobster", "city":"New York, NY", "type":"Sit-down Chain", "state":"New York", "tags":["sea food", "sit-down"], "price":29.00}	},	"add": {		"doc": {"id":"4", "name":"McDonalds", "city":"San Francisco, CA", "type":"Fast Food", "state":"California", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":9.00}	},	"add": {		"doc": {"id":"5", "name":"McDonalds", "city":"Atlanta, GA", "type":"Fast Food", "state":"Georgia", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00}	},	"add": {		"doc": {"id":"6", "name":"McDonalds", "city":"New York, NY", "type":"Fast Food", "state":"New York", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00}	},	"add": {		"doc": {"id":"7", "name":"McDonalds", "city":"Chicago, IL", "type":"Fast Food", "state":"Illinois", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00}	},	"add": {		"doc": {"id":"8", "name":"McDonalds", "city":"Austin, TX", "type":"Fast Food", "state":"Texas", "tags":["fast food", "hamburgers", "coffee", "wi-fi", "breakfast"], "price":4.00}	},	"add": {		"doc": {"id":"9", "name":"Pizza Hut", "city":"Atlanta, GA", "type":"Sit-down Chain", "state":"Georgia", "tags":["pizza", "sit-down", "delivery"], "price":15.00}	},	"add": {		"doc": {"id":"10", "name":"Pizza Hut", "city":"New York, NY", "type":"Sit-down Chain", "state":"New York", "tags":["pizza", "sit-down", "delivery"], "price":24.00}	},	"add": {		"doc": {"id":"11", "name":"Pizza Hut", "city":"Austin, TX", "type":"Sit-down Chain", "state":"Texas", "tags":["pizza", "sit-down", "delivery"], 
"price":18.00}	},	"add": {		"doc": {"id":"12", "name":"Freddy's Pizza Shop", "city":"Los Angeles, CA", "type":"Local Sit-down", "state":"California", "tags":["pizza", "pasta", "sit-down"], "price":25.00}	},	"add": {		"doc": {"id":"13", "name":"The Iberian Pig", "city":"Atlanta, GA", "type":"Upscale", "state":"Georgia", "tags":["spanish", "tapas", "sit-down", "upscale"], "price":45.00}	},	"add": {		"doc": {"id":"14", "name":"Sprig", "city":"Atlanta, GA", "type":"Local Sit-down", "state":"Georgia", "tags":["sit-down", "gluten-free", "southern cuisine"], "price":15.00}	},	"add": {		"doc": {"id":"15", "name":"Starbucks", "city":"San Francisco, CA", "type":"Coffee Shop", "state":"California", "tags":["coffee", "breakfast"], "price":7.50}	},	"add": {		"doc": {"id":"16", "name":"Starbucks", "city":"Atlanta, GA", "type":"Coffee Shop", "state":"Georgia", "tags":["coffee", "breakfast"], "price":4.00}	},	"add": {		"doc": {"id":"17", "name":"Starbucks", "city":"New York, NY", "type":"Coffee Shop", "state":"New York", "tags":["coffee", "breakfast"], "price":6.50}	},	"add": {		"doc": {"id":"18", "name":"Starbucks", "city":"Chicago, IL", "type":"Coffee Shop", "state":"Illinois", "tags":["coffee", "breakfast"], "price":6.00}	},	"add": {		"doc": {"id":"19", "name":"Starbucks", "city":"Austin, TX", "type":"Coffee Shop", "state":"Texas", "tags":["coffee", "breakfast"], "price":5.00}	},	"add": {		"doc": {"id":"20", "name":"Starbucks", "city":"Greenville, SC", "type":"Coffee Shop", "state":"South Carolina", "tags":["coffee", "breakfast"], "price":3.00}	}}
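    Doing this conversion by hand for 20 Documents is tedious, so it can be scripted. One catch: the target format repeats the "add" key, and a Python dict cannot hold duplicate keys, so the sketch below assembles the outer object as a string and uses json.dumps only for each inner command. The function name to_solr_commands is my own invention.

```python
import json


def to_solr_commands(docs):
    """Turn a list of documents into one JSON object with repeated "add" keys."""
    entries = [
        '  "add": ' + json.dumps({"doc": doc}, ensure_ascii=False)
        for doc in docs
    ]
    return "{\n" + ",\n".join(entries) + "\n}"


plain_json = """[
  {"id": "1", "name": "Red Lobster", "tags": ["sea food", "sit down"], "price": 33.00},
  {"id": "2", "name": "Red Lobster", "tags": ["sea food", "sit-down"], "price": 22.00}
]"""
print(to_solr_commands(json.loads(plain_json)))
```

    The result can be pasted into the Documents page in place of the hand-written version.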

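    Then start Tomcat and submit the converted data through the Documents page, with Document Type set to Solr Command (raw XML or JSON).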
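    If you would rather not paste into the browser, the same command body can also be sent to the core's /update handler over HTTP. The sketch below only builds the request. The URL (Tomcat on port 8080 and a core named core1) is purely an assumption, so substitute your own, and build_update_request is a hypothetical helper.

```python
import urllib.parse
import urllib.request


def build_update_request(body, update_url, commit=True):
    """Prepare a POST request carrying a JSON command body for Solr."""
    query = urllib.parse.urlencode({"commit": "true" if commit else "false"})
    return urllib.request.Request(
        url=f"{update_url}?{query}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed URL: adjust host, port, and core name to your own setup.
request = build_update_request(
    '{"add": {"doc": {"id": "1", "name": "Red Lobster"}}}',
    "http://localhost:8080/solr/core1/update",
)
print(request.get_method(), request.full_url)

# To actually send it (requires a running Solr instance):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```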

 

    After submitting, run a query to check that the Documents were indexed.
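    The check can also be made with a plain HTTP query against the core's /select handler instead of the Query page. Here is a small sketch that builds such a URL; once again the host, port, and core name are assumptions on my part, and build_select_url is a hypothetical helper.

```python
import urllib.parse


def build_select_url(select_url, query="*:*", rows=20):
    """Build a Solr query URL that asks for a JSON response."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{select_url}?{params}"


# Assumed URL: adjust host, port, and core name to your own setup.
base = "http://localhost:8080/solr/core1/select"
print(build_select_url(base))                           # match all Documents
print(build_select_url(base, 'name:"Pizza Hut"', 5))    # filter on one field
```

    Opening the first URL in a browser should list the Documents that were just imported, provided the commit has completed.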


   Pay attention to the Document Type option: if you select JSON here, you will get an exception.

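    The configuration and test data for this example are in the attachment below. If you run into any problems while following along, contact me. Java developers of all levels are also welcome to join the group to exchange ideas and learn together.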

   Yida's QQ:        7-3-6-0-3-1-3-0-5

   Yida's QQ group:  1-0-5-0-9-8-8-0-6

Reposted from: http://gcxbi.baihongyu.com/
